Overview

combined-shop-scraper

A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items.

Features

  • Define an input file components.json with the components to be scraped and their source URLs
  • Find the cheapest order combination, including shipping costs
  • Get price alarms when single components drop below a defined price
  • Easily extend to new shops (basic scraping know-how required). Basic support for notebooksbilliger, cyberport and future-x is included by default

Usage

JSON file definition

The input JSON file is named components.json by default and must be located in the same folder as scraper.py. This is the basic structure of the file:

{
  "component1": {
    "alarm_price": 260,
    "quantity": 1,
    "urls": [
      "https://www.someshop.com/component1",
      "https://www.someshop.com/component1-alternative",
      "https://www.anothershop.com/component1-alternative"]
  },
  "component2": {
    "urls": [
      "https://www.someshop.com/component2",
      "https://www.anothershop.com/component2",
      "https://www.onemoreshop.com/component2"]
  }
}

The component name and at least one URL are mandatory. It is possible to add several URLs from the same shop for the same component if there are alternatives for it. The quantity of each component defaults to 1; the alarm price is optional.
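
For illustration, below is a minimal sketch of how such a file could be read and normalized. The helper name load_components and the exact default handling are assumptions for this example, not the actual code in scraper.py:

import json
from pathlib import Path

def load_components(path: str = "components.json") -> dict:
    """Read the input file and fill optional fields with their defaults."""
    with Path(path).open(encoding="utf-8") as f:
        components = json.load(f)

    for name, spec in components.items():
        if not spec.get("urls"):
            raise ValueError(f"Component '{name}' needs at least one url")
        spec.setdefault("quantity", 1)        # quantity defaults to 1
        spec.setdefault("alarm_price", None)  # the alarm price is optional

    return components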

Execution

Just call the script scraper.py from within the folder so that the components.json file can be found. It will print an overview of the ideal order that minimizes the overall cost. The program runs just once and does not keep tracking prices in the background. As usual with scraping, be gentle and fair and don't abuse this program.
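
Conceptually, finding the ideal order means picking one offer per component and paying each shop's shipping cost at most once. The brute-force sketch below illustrates that idea; the function name and data layout are assumptions for this example and do not mirror scraper.py:

from itertools import product

def cheapest_order(offers: dict, shipping: dict) -> tuple[float, dict]:
    """offers:   {component: [(shop, unit_price, quantity), ...]}
    shipping: {shop: flat_shipping_cost}
    Returns the lowest total cost and the chosen offer per component."""
    best_total, best_choice = float("inf"), None
    for combo in product(*offers.values()):
        item_cost = sum(price * qty for _, price, qty in combo)
        used_shops = {shop for shop, _, _ in combo}
        total = item_cost + sum(shipping[shop] for shop in used_shops)
        if total < best_total:
            best_total, best_choice = total, dict(zip(offers, combo))
    return best_total, best_choice

The number of combinations grows quickly with many components and alternative URLs, which is fine for a handful of parts but would need pruning for much larger inputs.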

Addition of new shops

If you want to add a new shop, you need to edit the file shops.py and:

  • Enter the significant part of the shop URL in the method Shop._get_shops_dict and define a new class type (a child of Shop)
  • Implement the methods _process_soup and get_shipping_cost for the new class. Use the existing classes as a reference for the data you need to scrape (see the sketch after this list).
  • Add your new URLs to the input file!
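
As a rough illustration of what such a subclass could look like when added to shops.py (where the Shop base class is already defined), here is a sketch; the CSS selector, return types and shipping value are placeholders, not code taken from the existing classes:

from bs4 import BeautifulSoup

class SomeNewShop(Shop):  # Shop is the base class already defined in shops.py
    def _process_soup(self, soup: BeautifulSoup) -> float:
        # Extract the product price from the parsed page; the selector is a
        # placeholder and depends on the shop's actual HTML.
        price_tag = soup.select_one("span.product-price")
        return float(price_tag.text.strip().replace("€", "").replace(",", "."))

    def get_shipping_cost(self) -> float:
        # Flat shipping cost as a placeholder; a real shop may calculate it differently.
        return 4.99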

License

Copyright (c) 2021 javiser

combined-shop-scraper is distributed under the terms of the MIT License.

See the LICENSE for license details.
