A simple website crawler that asks the user for a website address, crawls it, and extracts specific data from the pages it finds.

Last update: Jan 10, 2022

Related tags

Web Crawling, Website-Crawler-Python

Overview

Website-Crawler-Python

This is a simple website crawler that asks the user for a website address, crawls it, and extracts specific data from that site. After receiving the address, it reports how many links it found there and asks the user how deep to crawl, as a number within that count.
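The source does not define "crawling depth" precisely. The sketch below takes one possible reading, in which the depth caps how many of the links found on the start page are followed. The names `extract_links`, `crawl`, and the `fetch` callback are hypothetical and not taken from the project; `fetch` stands in for whatever HTTP client is used, so the logic can be tried without network access.

```python
import re
from urllib.parse import urljoin

# Naive href matcher, used only to keep this sketch dependency-free.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_links(base_url, html):
    """Return absolute http(s) links in document order, without duplicates."""
    links = []
    for href in HREF_RE.findall(html):
        link = urljoin(base_url, href)
        if link.startswith(("http://", "https://")) and link not in links:
            links.append(link)
    return links


def crawl(start_url, depth, fetch):
    """Fetch the start page, then follow at most `depth` of its links.

    `fetch` is any callable mapping a URL to HTML text (an assumption).
    Returns a dict of {url: html} in the order pages were fetched.
    """
    pages = {start_url: fetch(start_url)}
    links = extract_links(start_url, pages[start_url])
    depth = max(0, min(depth, len(links)))  # keep depth within the link count
    for link in links[:depth]:
        if link not in pages:  # skip pages already fetched
            pages[link] = fetch(link)
    return pages
```

Passing a dictionary lookup as `fetch` is enough to experiment with the depth limit offline; a real run would pass a function that downloads the page.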

Website Crawler takes 3 inputs:

A website address
Integer value for the crawling depth
A user-specified regular expression for finding user-specific data
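The three inputs above can be checked before any crawling starts. This is a minimal sketch rather than the project's own code: the function name `parse_inputs` and the specific rules (http/https only, non-negative depth) are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def parse_inputs(address, depth_text, pattern_text):
    """Validate the three inputs and return (url, depth, compiled_pattern).

    Raises ValueError with a readable message if any input is unusable.
    """
    parsed = urlparse(address.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) address: {address!r}")

    depth = int(depth_text)  # int() raises ValueError for non-integers
    if depth < 0:
        raise ValueError("crawling depth must not be negative")

    try:
        pattern = re.compile(pattern_text)
    except re.error as exc:
        raise ValueError(f"invalid regular expression: {exc}") from exc

    return parsed.geturl(), depth, pattern
```

In an interactive script, the three arguments would typically come from three `input()` prompts, with the `ValueError` caught so the user can be asked again.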

General tasks:

Finds all Norwegian mobile numbers and saves them to a text file.
Finds all sub-links on the given website and saves them to a text file.
Saves the website's raw HTML code to a text file.
Finds all email addresses and saves them to a text file.
Finds all comments used in the website's source and saves them to a text file.
Finds the five most used words and prints them to the terminal.
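Several of the tasks above come down to running regular expressions over the fetched HTML and counting words. The sketch below illustrates that idea with the standard library only; the patterns and function names are assumptions, not the project's actual expressions. In particular, the mobile pattern assumes eight contiguous digits starting with 4 or 9, optionally prefixed by +47 or 0047, and the email pattern is deliberately simple.

```python
import re
from collections import Counter

# Assumed patterns, for illustration only.
MOBILE_RE = re.compile(r"(?<!\d)(?:(?:\+|00)47 ?)?([49]\d{7})(?!\d)")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[^\W\d_]+")  # runs of letters, including æ, ø, å


def unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def find_mobiles(html):
    return unique(MOBILE_RE.findall(html))


def find_emails(html):
    return unique(EMAIL_RE.findall(html))


def find_comments(html):
    return [c.strip() for c in COMMENT_RE.findall(html)]


def top_words(html, n=5):
    """Return the n most common words in the visible text as (word, count)."""
    text = TAG_RE.sub(" ", COMMENT_RE.sub(" ", html))
    return Counter(w.lower() for w in WORD_RE.findall(text)).most_common(n)


def save_lines(path, items):
    """Write one item per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.writelines(f"{item}\n" for item in items)
```

For example, `save_lines("emails.txt", find_emails(html))` would cover the email task, and `print(top_words(html))` the word-count task. Stripping tags with a regular expression leaves script and style text in place, so an HTML parser such as BeautifulSoup is the more robust choice for real pages.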

This is a Python-based project that depends on the following libraries:

re (regular expressions)
urllib3
BeautifulSoup 4
Counter from collections

Owner

Faisal Ahmed

JD.com Moutai flash-purchase script, latest version as of April 2021

A list of Python Bots used to extract data from several websites

A very simple free proxy list scraper.

Download images from forum threads

This code will be able to scrape movies from a movie website and also provide download links to newly uploaded movies.

A scheduled HITsz epidemic-reporting script based on GitHub Actions, ready to use out of the box

Scrapy uses Request and Response objects for crawling web sites.

Google Scholar Web Scraping

Scrapy-soccer-games - Scraping information about soccer games from a few websites

Anonymously scrapes onlinesim.ru for new usable phone numbers.

Tool to scan for secret files on HTTP servers

A web crawler script that crawls the target website and lists its links

Dictionary - Application focused on word search through web scraping

Unja is a fast & light tool for fetching known URLs from Wayback Machine

A crawler that fetches new products with the maximum discount from the Banimode website

A project built with Python and Flask to scrape a music site

Script to scrape direct download links (DDLs) from a Google Drive index.

The open-source web scrapers that feed the Los Angeles Times California coronavirus tracker.

This scraper collects the email IDs of faculty members from a given link/page and stores them in a CSV file

Async Python 3.6+ web scraping micro-framework based on asyncio