A Python module to bypass Cloudflare's anti-bot page.

Overview

cloudscraper


A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests. Cloudflare changes their techniques periodically, so I will update this repo frequently.

This can be useful if you wish to scrape or crawl a website protected with Cloudflare. Cloudflare's anti-bot page currently just checks if the client supports JavaScript, though they may add additional techniques in the future.

Due to Cloudflare continually changing and hardening their protection page, cloudscraper requires a JavaScript engine/interpreter to solve JavaScript challenges. This allows the script to easily impersonate a regular web browser without explicitly deobfuscating and parsing Cloudflare's JavaScript.

For reference, this is the default message Cloudflare uses for these sorts of pages:

Checking your browser before accessing website.com.

This process is automatic. Your browser will redirect to your requested content shortly.

Please allow up to 5 seconds...

Any script using cloudscraper will sleep for ~5 seconds for the first visit to any site with Cloudflare anti-bots enabled, though no delay will occur after the first request.
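
For illustration, here is a minimal sketch of that behaviour (somesite.com is a placeholder; exact timings will vary):

import time
import cloudscraper

scraper = cloudscraper.create_scraper()

start = time.time()
scraper.get("http://somesite.com")  # first visit: the IUAM challenge is solved, expect ~5 seconds
print("first request took %.1fs" % (time.time() - start))

start = time.time()
scraper.get("http://somesite.com")  # same site again: the clearance cookie is reused, no delay
print("second request took %.1fs" % (time.time() - start))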

Donations

If you feel like showing your love and/or appreciation for this project, then how about shouting me a coffee or beer :)

Buy Me A Coffee

Installation

Simply run pip install cloudscraper. The PyPI package is at https://pypi.python.org/pypi/cloudscraper/

Alternatively, clone this repository and run python setup.py install.

Dependencies

python setup.py install will install the Python dependencies automatically. The only things you need to install yourself are the JavaScript interpreters and/or engines you decide to use; js2py is the exception, since it is installed by default as part of the requirements.

Javascript Interpreters and Engines

We support the following JavaScript interpreters/engines:

  • ChakraCore
  • js2py
  • native (the pure-Python solver)
  • Node.js
  • V8

Updates

Cloudflare modifies their anti-bot protection page occasionally. So far it has changed roughly once per year on average.

If you notice that the anti-bot page has changed, or if this module suddenly stops working, please create a GitHub issue so that I can update the code accordingly.

  • Many issues are a result of users not updating to the latest release of this project. Before filing an issue, please run the following command:
pip show cloudscraper

If the value of the version field is not the latest release, please run the following to update your package:

pip install cloudscraper -U
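
You can also check the version from Python itself; a quick sketch, assuming your installed release exposes __version__ (recent releases do):

import cloudscraper
print(cloudscraper.__version__)  # compare this against the latest release on PyPI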

If you are still encountering a problem, open an issue and please include:

  • The full exception and stack trace.
  • The URL of the Cloudflare-protected page which the script does not work on.
  • A Pastebin or Gist containing the HTML source of the protected page.
  • The version number from pip show cloudscraper.

Usage

The simplest way to use cloudscraper is by calling create_scraper().

import cloudscraper

scraper = cloudscraper.create_scraper()  # returns a CloudScraper instance
# Or: scraper = cloudscraper.CloudScraper()  # CloudScraper inherits from requests.Session
print(scraper.get("http://somesite.com").text)  # => "<!DOCTYPE html><html><head>..."

That's it...

Any requests made from this session object to websites protected by Cloudflare anti-bot will be handled automatically. Websites not using Cloudflare will be treated normally. You don't need to configure or call anything further, and you can effectively treat all websites as if they're not protected with anything.

You use cloudscraper exactly the same way you use Requests. cloudscraper works identically to a Requests Session object, except instead of calling requests.get() or requests.post(), you call scraper.get() or scraper.post().

Consult Requests' documentation for more information.
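
As a quick sketch of that equivalence, anything you would do with a requests Session should work unchanged on the scraper (the URL and payload here are placeholders):

import cloudscraper

scraper = cloudscraper.create_scraper()

# Standard Requests features work as-is: params, data, headers, cookies, etc.
resp = scraper.post(
    "http://somesite.com/login",
    data={"username": "user", "password": "hunter2"},
    headers={"Referer": "http://somesite.com/"},
)
print(resp.status_code)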

Options

Brotli

Description

Brotli decompression support has been added, and it is enabled by default.

Parameters

Parameter Value Default
allow_brotli (boolean) True

Example

scraper = cloudscraper.create_scraper(allow_brotli=False)

Browser / User-Agent Filtering

Description

Control how and which User-Agent is "randomly" selected.

Parameters

Can be passed as an argument to create_scraper(), get_tokens(), get_cookie_string().

Parameter Value Default
browser (string) chrome or firefox None

Or

Parameter Value Default
browser (dict)

browser dict parameters:

Parameter Value Default
browser (string) chrome or firefox None
mobile (boolean) True
desktop (boolean) True
platform (string) 'linux', 'windows', 'darwin', 'android', 'ios' None
custom (string) None

Example

scraper = cloudscraper.create_scraper(browser='chrome')

or

# will give you only mobile chrome User-Agents on Android
scraper = cloudscraper.create_scraper(
    browser={
        'browser': 'chrome',
        'platform': 'android',
        'desktop': False
    }
)

# will give you only desktop firefox User-Agents on Windows
scraper = cloudscraper.create_scraper(
    browser={
        'browser': 'firefox',
        'platform': 'windows',
        'mobile': False
    }
)

# custom will also try to find the user-agent string in browsers.json.
# If a match is found, it will use the headers and cipherSuite from that "browser";
# otherwise, a generic set of headers and cipherSuite will be used.
scraper = cloudscraper.create_scraper(
    browser={
        'custom': 'ScraperBot/1.0',
    }
)

Debug

Description

Prints out header and content information of the request for debugging.

Parameters

Can be set as an attribute via your cloudscraper object or passed as an argument to create_scraper(), get_tokens(), get_cookie_string().

Parameter Value Default
debug (boolean) False

Example

scraper = cloudscraper.create_scraper(debug=True)
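
Since debug can also be set as an attribute on the scraper object (per the parameters note above), a minimal sketch of toggling it around a single request:

import cloudscraper

scraper = cloudscraper.create_scraper()
scraper.debug = True                 # verbose output for the next request
scraper.get("http://somesite.com")
scraper.debug = False                # switch it off again afterwards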

Delays

Description

The Cloudflare IUAM challenge requires the browser to wait ~5 seconds before submitting the challenge answer. If you would like to override this delay, use the delay parameter.

Parameters

Can be set as an attribute via your cloudscraper object or passed as an argument to create_scraper(), get_tokens(), get_cookie_string().

Parameter Value Default
delay (float) extracted from IUAM page

Example

scraper = cloudscraper.create_scraper(delay=10)

Existing session

Description

If you already have an existing Requests session, you can pass it to the function create_scraper() to continue using that session.

Parameters

Parameter Value Default
sess (requests.session) None

Example

import requests
import cloudscraper

session = requests.session()
scraper = cloudscraper.create_scraper(sess=session)

Note

Unfortunately, not all of Requests' session attributes are easily transferable. If you run into problems with this, you should replace your initial session initialization call

From:

sess = requests.session()

To:

sess = cloudscraper.create_scraper()

JavaScript Engines and Interpreters

Description

cloudscraper currently supports the following JavaScript engines/interpreters:

  • ChakraCore
  • js2py
  • native (the pure-Python solver)
  • Node.js
  • V8

Parameters

Can be set as an attribute via your cloudscraper object or passed as an argument to create_scraper(), get_tokens(), get_cookie_string().

Parameter Value Default
interpreter (string) native

Example

scraper = cloudscraper.create_scraper(interpreter='nodejs')

3rd Party Captcha Solvers

Description

cloudscraper currently supports the following 3rd party Captcha solvers, should you require them.

Note

I am working on adding more 3rd-party solvers. If you wish to have a service added that is not currently supported, please raise a support ticket on GitHub.

Required Parameters

Can be set as an attribute via your cloudscraper object or passed as an argument to create_scraper(), get_tokens(), get_cookie_string().

Parameter Value Default
captcha (dict) None

2captcha

Required captcha Parameters
Parameter Value Required Default
provider (string) 2captcha yes
api_key (string) yes
no_proxy (boolean) no False
Note

If proxies are set, you can disable sending the proxies to 2captcha by setting no_proxy to True.

Example
scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={
    'provider': '2captcha',
    'api_key': 'your_2captcha_api_key'
  }
)
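
And, per the note above, a sketch of keeping configured proxies away from 2captcha via no_proxy (the proxy URL is a placeholder):

scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={
    'provider': '2captcha',
    'api_key': 'your_2captcha_api_key',
    'no_proxy': True
  }
)
# proxies set on the session are used for requests, but are not sent to 2captcha
scraper.proxies = {'http': 'http://localhost:8080', 'https': 'http://localhost:8080'}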

anticaptcha

Required captcha Parameters
Parameter Value Required Default
provider (string) anticaptcha yes
api_key (string) yes
no_proxy (boolean) no False
Note

If proxies are set, you can disable sending the proxies to anticaptcha by setting no_proxy to True.

Example
scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={
    'provider': 'anticaptcha',
    'api_key': 'your_anticaptcha_api_key'
  }
)

CapMonster Cloud

Required captcha Parameters
Parameter Value Required Default
provider (string) capmonster yes
clientKey (string) yes
no_proxy (boolean) no False
Note

If proxies are set, you can disable sending the proxies to CapMonster by setting no_proxy to True.

Example
scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={
    'provider': 'capmonster',
    'clientKey': 'your_capmonster_clientKey'
  }
)

deathbycaptcha

Required captcha Parameters
Parameter Value Required Default
provider (string) deathbycaptcha yes
username (string) yes
password (string) yes
Example
scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={
    'provider': 'deathbycaptcha',
    'username': 'your_deathbycaptcha_username',
    'password': 'your_deathbycaptcha_password',
  }
)

9kw

Required captcha Parameters
Parameter Value Required Default
provider (string) 9kw yes
api_key (string) yes
maxtimeout (int) no 180
Example
scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={
    'provider': '9kw',
    'api_key': 'your_9kw_api_key',
    'maxtimeout': 300
  }
)

return_response

Use this if you want the requests response payload without solving the Captcha.

Required captcha Parameters
Parameter Value Required Default
provider (string) return_response yes
Example
scraper = cloudscraper.create_scraper(
  interpreter='nodejs',
  captcha={'provider': 'return_response'}
)

Integration

It's easy to integrate cloudscraper with other applications and tools. Cloudflare uses two cookies as tokens: one to verify you made it past their challenge page and one to track your session. To bypass the challenge page, simply include both of these cookies (with the appropriate user-agent) in all HTTP requests you make.

To retrieve just the cookies (as a dictionary), use cloudscraper.get_tokens(). To retrieve them as a full Cookie HTTP header, use cloudscraper.get_cookie_string().

get_tokens and get_cookie_string both accept Requests' usual keyword arguments (like get_tokens(url, proxies={"http": "socks5://localhost:9050"})).

Please read Requests' documentation on request arguments for more information.


User-Agent Handling

The two integration functions return a tuple of (cookie, user_agent_string).

You must use the same user-agent string for obtaining tokens and for making requests with those tokens, otherwise Cloudflare will flag you as a bot.

That means you have to pass the returned user_agent_string to whatever script, tool, or service you are passing the tokens to (e.g. curl, or a specialized scraping tool), and it must use that passed user-agent when it makes HTTP requests.
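
For example, a minimal sketch of replaying the tokens with plain Requests rather than an external tool (the URL is a placeholder):

import requests
import cloudscraper

tokens, user_agent = cloudscraper.get_tokens("http://somesite.com")

# The clearance cookies are only honoured when presented with the same
# User-Agent that solved the challenge.
response = requests.get(
    "http://somesite.com",
    cookies=tokens,
    headers={"User-Agent": user_agent},
)
print(response.status_code)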


Integration examples

Remember, you must always use the same user-agent when retrieving or using these cookies. These functions all return a tuple of (cookie_dict, user_agent_string).


Retrieving a cookie dict through a proxy

get_tokens is a convenience function for returning a Python dict containing Cloudflare's session cookies. For demonstration, we will configure this request to use a proxy. (Please note that if you request Cloudflare clearance tokens through a proxy, you must always use the same proxy when those tokens are passed to the server. Cloudflare requires that the challenge-solving IP and the visitor IP stay the same.)

If you do not wish to use a proxy, just don't pass the proxies keyword argument. These convenience functions support all of Requests' normal keyword arguments, like params, data, and headers.

import cloudscraper

proxies = {"http": "http://localhost:8080", "https": "http://localhost:8080"}
tokens, user_agent = cloudscraper.get_tokens("http://somesite.com", proxies=proxies)
print(tokens)
# => {
#        'cf_clearance': 'c8f913c707b818b47aa328d81cab57c349b1eee5-1426733163-3600',
#        '__cfduid': 'dd8ec03dfdbcb8c2ea63e920f1335c1001426733158'
#    }

Retrieving a cookie string

get_cookie_string is a convenience function for returning the tokens as a string for use as a Cookie HTTP header value.

This is useful when crafting an HTTP request manually, or working with an external application or library that passes on raw cookie headers.

import cloudscraper

cookie_value, user_agent = cloudscraper.get_cookie_string('http://somesite.com')

print('GET / HTTP/1.1\nCookie: {}\nUser-Agent: {}\n'.format(cookie_value, user_agent))

# GET / HTTP/1.1
# Cookie: cf_clearance=c8f913c707b818b47aa328d81cab57c349b1eee5-1426733163-3600; __cfduid=dd8ec03dfdbcb8c2ea63e920f1335c1001426733158
# User-Agent: Some/User-Agent String

curl example

Here is an example of integrating cloudscraper with curl. As you can see, all you have to do is pass the cookies and user-agent to curl.

import subprocess
import cloudscraper

# With get_tokens() cookie dict:

# tokens, user_agent = cloudscraper.get_tokens("http://somesite.com")
# cookie_arg = 'cf_clearance={}; __cfduid={}'.format(tokens['cf_clearance'], tokens['__cfduid'])

# With get_cookie_string() cookie header; recommended for curl and similar external applications:

cookie_arg, user_agent = cloudscraper.get_cookie_string('http://somesite.com')

# With a custom user-agent string you can optionally provide:

# ua = "Scraping Bot"
# cookie_arg, user_agent = cloudscraper.get_cookie_string("http://somesite.com", user_agent=ua)

result = subprocess.check_output(
    [
        'curl',
        '--cookie',
        cookie_arg,
        '-A',
        user_agent,
        'http://somesite.com'
    ]
)

A trimmed-down version that prints the page contents of any Cloudflare-protected site via curl.

Warning: shell=True can be dangerous to use with subprocess in real code.

import shlex
import subprocess
import cloudscraper

url = "http://somesite.com"
cookie_arg, user_agent = cloudscraper.get_cookie_string(url)
cmd = "curl --cookie {cookie_arg} -A {user_agent} {url}"
print(
    subprocess.check_output(
        cmd.format(
            # quote each value so cookies and user-agents containing spaces survive the shell
            cookie_arg=shlex.quote(cookie_arg),
            user_agent=shlex.quote(user_agent),
            url=shlex.quote(url)
        ),
        shell=True
    )
)
Comments
  • Ugly fix for fake jsch_vc, pass params

    They started to include second fake form with bad params that we have to ignore. Challenge html code: https://gist.github.com/oczkers/b4f7408e81c70b9b32643690d2caf19e website: https://takefile.link

    OrderedDict only keeps the last value when there are duplicate keys, so we ended up with jschl_vc=1, pass="". I've fixed it by reversing the list before converting list->OrderedDict, so now it uses the first-seen values instead of the last-seen. It worked for this site but could easily change again in the future, so this is an ugly fix and you probably don't want to merge this - we should use something more bulletproof, like a loop checking params one by one, or cutting part of the HTML code before the regex, etc.

    opened by oczkers 10
  • Allow replacing actual call to perform HTTP request via subclassing

    Basically, I have my own urllib wrapper that I'm maintaining almost entirely out of spite. It has its own cookie/UA/header/etc... management, and I'd like to be able to just wrap that instead of having to move things back and forth between it and the requests session continuously.

    This change basically moves the actual calls to the parent super().request() call into a stub function, so I can subclass CloudScraper(), and then just replace the body of perform_request() with my own HTTP fetching machinery.

    I'm not sure this is something of interest to really anyone other than myself, but it's also a really simple change (and could potentially be useful for testing/mocking purposes as well). Being able to plug in any arbitrary HTTP(s) transport seems like a nice feature too.

    opened by fake-name 6
  • Added parameter to change the type of encryption used

    I was having problems performing the handshake with some servers because they use 384-bit encryption, so I found a curve that solves my problem: "secp384r1". I added the possibility for the user to choose the best algorithm for each use.

    The main problem I had was handshake errors like: (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE]) sslv3 alert handshake failure (_ssl.c:1108)'))

    opened by felipehertzer 4
  • fix: Add missing comma in `get_tokens`

    • This comma has most probably been left out unintentionally, leading to string concatenation between the two consecutive lines. This issue was found automatically using a regular expression.
    opened by mrshu 2
  • Async integration

    As it is not possible to create issues I'll ask here.

    I am coming from aiocfscrape, which was an async approach/reimplementation of cfscrape. cfscrape seems to be dead nowadays, so aiocfscrape would now either have to do the bypassing by itself or rebase on a new project. Either way, it would need to be rewritten.

    You seem to be fond of supporting various environments (e.g. multiple different JS engines and captcha services), so I propose adding async support with aiohttp directly to this repo instead of leeching off it.

    Architecturally, I'd put the different implementations (requests, aiohttp) in one place, similarly to the JS engines and captcha services, where the user can then choose either of them. The difference would be that the user could tell the session async=True and it would then get the async implementation instead of the requests one. Another way would be to just create a new module and tell the user to import from ...async.CloudScraper instead. This would also need second implementations of e.g. the Node.js engine, as we'd have to use async subprocesses instead of the usual ones.

    This would also mean the Python version compatibility wouldn't be 3.x, but rather at least 3.5.x, or even 3.6, as 3.5 has actually reached its end of life.

    I'd be glad to create/maintain the async implementation.

    opened by Nachtalb 2
  • Use requests.utils.extract_zipped_paths() to read browsers.json

    I was packaging cloudscraper and requests in a zip file and had kludged a way to read browsers.json, when I found that requests already had a better solution that it uses to read certifi.cacert.pem.

    I applied it to cloudscraper and thought I'd at least offer it to you. It's up to you, of course, whether you find this useful or not.

    Thanks for making cloudscraper available.

    opened by JimmXinu 1
  • Some fix in regex to pass a cloudflare

    I had issues with a Cloudflare page (I added it to the tests folder), because there is a class in the form markup, and spaces, that made the parsing go wrong.

    For information, I only managed to pass this Cloudflare page with js2py; there were errors with native (I got a loop, so I think the result of the challenge is wrong).

    opened by aroquemaurel 1
  • setup: exclude tests for default install

    We probably don't need to install tests for "normal" users, and this is required to get the Gentoo ebuild (package manager) working.

    P.S. You forgot to push a new release/archive on GitHub - the latest is 1.2.9.

    opened by oczkers 1
  • Add tests

    I made a couple of necessary fixes to pass some tests, and a couple are being skipped for the time being. If you're running tox and have a .tox cache, you'll need to remove it to refresh the dependencies. Unable to use make ci on Travis CI atm, related to https://github.com/pytest-dev/pytest-xdist/issues/187

    Coverage from the CI build: https://coveralls.io/github/pro-src/cloudscraper.py

    opened by ghost 1
  • Dev environment + CI configuration

    • .coveragerc in preference of not passing certain CLI arguments around
    • Update .gitignore to exclude generated tests/coverage related files
    • Update .gitignore to exclude .idea/ and .env (Prevent accidental inclusion)
    • .travis.yml for testing on CI (Coverage reporting isn't configured for it yet)
    • Makefile for common tasks
    • Pipfile for dev-dependency management and CI builds
    • Pipfile.lock to lock dev-dependencies for CI builds
    • requirements.txt for some integrations and convenience
    • Fixed some typos
    • Updated dependencies
    • tests/test_cloudscraper.py is just a placeholder
    • tox.ini - tox configuration file for pre-release testing, etc.
    • Other dev-dependencies as required by future tests
    • Prefers coveralls for coverage reporting
    opened by ghost 1
  • CloudflareChallengeError: Detected a Cloudflare version 2 challenge

    Hello, I got this error: "CloudflareChallengeError: Detected a Cloudflare version 2 challenge. This feature is not available in the opensource (free) version." Can you help me?

    opened by bL34ch-exe 0
Releases (latest: 1.2.65)
  • 1.2.65(Nov 9, 2022)

  • 1.2.62(Aug 27, 2022)

  • 1.2.61(Aug 27, 2022)

  • 1.2.60(Mar 15, 2022)

  • 1.2.58(Apr 10, 2021)

  • 1.2.56(Jan 28, 2021)

  • 1.2.54(Jan 27, 2021)

  • 1.2.52(Jan 7, 2021)

  • 1.2.50(Dec 25, 2020)

  • 1.2.46(Jul 27, 2020)

    • Removed debug from 2captcha (ooops, my bad).
    • Added no_proxy to the captcha parameters, if you don't want to send your proxy to 2captcha / anticaptcha.
    • Added platform filtering for the browser (User-Agent) via the platform parameter.
    • Added doubleDown parameter to control whether a re-request is performed when a Captcha is detected.

  • 1.2.44(Jul 24, 2020)

  • 1.2.40(May 27, 2020)

    ~12 days have passed and Cloudflare updated again... they're keeping to the schedule 👍

    • Fixed: Cloudflare V1 challenge change (broke regex by introducing blank a.value).
    • Fixed: string -> float -> string conversion caused rounding-precision issues with py2 str().
    • Enhancement: added pre/post hooking into the request function.

  • 1.2.38(May 16, 2020)

    • Update regex for new Cloudflare changes in numerous places.
    • Updated JSFuck challenge for new dynamic k variable.
    • Updated interpreters to account for new dynamic k allocation from subset list.

  • 1.2.36(May 4, 2020)

    • Update regex for Cloudflare form challenge
    • Overwrite auto_set_ecdh by manually setting elliptic curve
    • Rewrote native interpreter for JSFuck due to nested calculations
    • Added exception if new Cloudflare challenge detected.
    • Added support for hCaptcha in 9KW

  • 1.2.34(Apr 22, 2020)

    • Add ability for custom ssl context to be passed
    • Added new timer to anticaptcha module
    • Fixed Cloudflare's challenge form change
    • Removed DNT from headers causing reCaptcha on some sites
    • Updated cipher suite for browsers

  • 1.2.32(Apr 2, 2020)

  • 1.2.30(Mar 20, 2020)

  • 1.2.28(Mar 11, 2020)

    The good folks over at Cloudflare have changed something... yet again... and explicitly setting ALPN now causes challenge issues on Ubuntu and Windows.

  • 1.2.26(Mar 4, 2020)

  • 1.2.24(Feb 18, 2020)

    Just some refactoring / bug fixes

    • Refactored 302 Redirect on localized path with no schema.
    • @dipu-bd submitted PR for User_Agent.loadUserAgent() to close browser.json.

    Thanks to @Fran008, @TheYoke, @paulitap88, @vrayv and anyone else I missed for raising the tickets and testing the dev branches for me ❤️

    cheers guys.

  • 1.2.23(Feb 15, 2020)

  • 1.2.22(Feb 14, 2020)

  • 1.2.20(Jan 15, 2020)

    • Changed openSSL warning to a print instead of a raised exception.
    • Removed Brotli as a required dependency.
    • Updated cipher suites for User-Agents
    • Fixed a bug in matching custom User-Agents
  • 1.2.18(Dec 25, 2019)

    Hohoho Merry Christmas. …

    • Improve / re-implement redirection support
    • Also support http -> https protocol scheme switch on challenge solve
    • Re-word Cloudflare 1020 block message
    • Add cookie test
    • Updated README.md

  • 1.2.16(Dec 12, 2019)

    Has a number of fixes

    • New native python Cloudflare challenge solver (now default interpreter).
    • Cloudflare sometimes redirects instead of passing through after a challenge solve; re-request if the response is a redirect.
    • Alert/Raise Error if Cloudflare 1020 firewall block detected.
    • Removed the requirement for pyopenssl
    • Split out pytests into dev requirements
    • Added custom parameter to browser param for custom User-Agent
    • Added tryMatch code in conjunction with the custom parameter, which will try to find and load ciphers && headers if custom is found in browsers.json; otherwise a custom default will be used.

  • 1.2.9(Nov 27, 2019)

    IUAM

    • Endpoints have changed to detect parameter __cf_chl_jschl_tk__ with UUID, for the challenge solve
    • Method is now a POST, no longer a GET
    • Parameters have been removed, and are now instead sent as data in the POST form

    reCaptcha

    • Changes in IUAM apply here as well as the additional listed below
    • Endpoints have changed to detect parameter __cf_chl_captcha_tk__ with UUID, for the challenge solve
    • New id param added to the payload, derived from the CF-RAY header, which is also in the data-ray variable

    Testing

    • Testing is disabled until I write some new tests.

  • 1.2.8(Nov 12, 2019)

  • 1.2.7(Nov 6, 2019)

    • Removed cipher ECDHE-RSA-CHACHA20-POLY1305 to mitigate reCaptcha generation from Cloudflare
    Removed Nothinng -> RuntimeError ReCaptcha
    Removed Nothinng -> RuntimeError ReCaptcha
    Removed TLS_CHACHA20_POLY1305_SHA256 -> RuntimeError ReCaptcha
    Removed TLS_AES_128_GCM_SHA256 -> RuntimeError ReCaptcha
    Removed TLS_AES_256_GCM_SHA384 -> RuntimeError ReCaptcha
    Removed ECDHE-RSA-AES128-GCM-SHA256 -> 200
    Removed AES128-GCM-SHA256 -> 200
    Removed AES256-GCM-SHA384 -> 200
    Removed AES256-SHA -> 200
    Removed ECDHE-ECDSA-AES256-GCM-SHA384 -> 200
    Removed ECDHE-ECDSA-CHACHA20-POLY1305 -> 200
    Removed ECDHE-RSA-CHACHA20-POLY1305 -> 200
    Removed ECDHE-ECDSA-AES128-GCM-SHA256 -> 200
    Removed TLS_AES_128_GCM_SHA256 -> RuntimeError ReCaptcha
    Removed TLS_AES_256_GCM_SHA384 -> RuntimeError ReCaptcha
    Removed TLS_CHACHA20_POLY1305_SHA256 -> RuntimeError ReCaptcha
    Removed ECDHE-ECDSA-AES128-GCM-SHA256 -> 200
    Removed ECDHE-ECDSA-AES256-SHA -> 200
    Removed ECDHE-RSA-CHACHA20-POLY1305 -> 200
    Removed ECDHE-ECDSA-AES128-SHA -> 200
    Removed ECDHE-RSA-AES128-GCM-SHA256 -> 200
    Removed ECDHE-ECDSA-CHACHA20-POLY1305 -> 200
    Removed DHE-RSA-AES256-SHA -> 200
    Removed ECDHE-ECDSA-AES256-GCM-SHA384 -> 200
    Removed AES256-SHA -> 200
    Removed DHE-RSA-AES128-SHA -> 200
    
    * Working list, by removing one of these ciphers in both browsers:
    ECDHE-RSA-CHACHA20-POLY1305
    ECDHE-RSA-AES128-GCM-SHA256
    ECDHE-ECDSA-CHACHA20-POLY1305
    ECDHE-ECDSA-AES256-GCM-SHA384
    ECDHE-ECDSA-AES128-GCM-SHA256
    AES256-SHA
    
    +-------------------------------+--------+---------+------------+
    |             Cipher            | Chrome | Firefox | Removable? |
    +-------------------------------+--------+---------+------------+
    |     TLS_AES_128_GCM_SHA256    |   X    |    X    |            |
    |     TLS_AES_256_GCM_SHA384    |   X    |    X    |            |
    |  TLS_CHACHA20_POLY1305_SHA256 |   X    |    X    |            |
    | ECDHE-ECDSA-AES128-GCM-SHA256 |   X    |    X    |    Yes     |
    |  ECDHE-RSA-AES128-GCM-SHA256  |   X    |    X    |    Yes     |
    | ECDHE-ECDSA-AES256-GCM-SHA384 |   X    |    X    |    Yes     |
    | ECDHE-ECDSA-CHACHA20-POLY1305 |   X    |    X    |    Yes     |
    |  ECDHE-RSA-CHACHA20-POLY1305  |   X    |    X    |    Yes     |
    |       AES128-GCM-SHA256       |   X    |         |            |
    |       AES256-GCM-SHA384       |   X    |         |            |
    |           AES256-SHA          |   X    |    X    |    Yes     |
    |     ECDHE-ECDSA-AES256-SHA    |        |    X    |            |
    |     ECDHE-ECDSA-AES128-SHA    |        |    X    |            |
    |       DHE-RSA-AES128-SHA      |        |    X    |            |
    |       DHE-RSA-AES256-SHA      |        |    X    |            |
    +-------------------------------+--------+---------+------------+
    
  • 1.2.5(Oct 23, 2019)

  • 1.2.2(Oct 9, 2019)

Owner: VeNoMouS (Discord: VeNoMouSNZ#5979)