Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

CLI tool to fix linked references for dates.

Fix Logseq dates This is a CLI tool to fix the date references following a change in date format since the current version (0.4.4) of Logseq does not

Isaac Dadzie 5 May 18, 2022
dsub is a command-line tool that makes it easy to submit and run batch scripts in the cloud.

Open-source command-line tool to run batch computing tasks and workflows on backend services such as Google Cloud.

Data Biosphere 233 Jan 01, 2023
An ZFS administration tool inspired on Midnight commander

ZC - ZFS Commander An ZFS administration tool inspired on Midnight commander Work in Progress Description ZFS Commander is a simple front-end for the

34 Dec 07, 2022
MsfMania is a command line tool developed in Python that is designed to bypass antivirus software on Windows and Linux/Mac in the future

MsfMania MsfMania is a command line tool developed in Python that is designed to bypass antivirus software on Windows and Linux/Mac in the future. Sum

446 Dec 21, 2022
pyNPS - A cli Linux and Windows Nopaystation client made with python 3 and wget

Currently, all the work is being done inside the refactoring branch. pyNPS - A cli Linux and Windows Nopaystation client made with python 3 and wget P

Everton Correia 45 Dec 11, 2022
A command line tool made in Python for the popular rhythm game

osr!name A command line tool made in Python for the popular rhythm game "osu!" that changes the player name of a .osr file (replay file). Example: Not

2 Dec 28, 2021
Customisable pharmacokinetic model accessible via bash CLI allowing for variable dose calculations as well as intravenous and subcutaneous administration calculations

Pharmacokinetic Modelling Group Project A PharmacoKinetic (PK) modelling function for analysis of injected solute dynamics over time, developed by Gro

1 Oct 24, 2021
Pymongo based CLI client, to run operation on existing databases and collections

Mongodb-Operations-Console Pymongo based CLI client, to run operation on existing databases and collections Program developed by Gustavo Wydler Azuaga

Gus 1 Dec 01, 2021
liquidctl – liquid cooler control Cross-platform tool and drivers for liquid coolers and other devices

Cross-platform CLI and Python drivers for AIO liquid coolers and other devices

1.7k Jan 08, 2023
Freaky fast fuzzy Denite/CtrlP matcher for vim/neovim

Freaky fast fuzzy Denite/CtrlP matcher for vim/neovim This is a matcher plugin for denite.nvim and CtrlP.

Raghu 113 Sep 29, 2022
Python-based implementation and comparison of strategies to guess words at Wordle

Solver and comparison of strategies for Wordle Motivation The goal of this repository is to compare, in terms of performance, strategies that minimize

Ignacio L. Ibarra 4 Feb 16, 2022
A linux-like remote terminal for Micropython

A linux-like remote terminal for Micropython

Christian Köver - Draxl 2 Nov 14, 2021
A basic molecule viewer written in Python, using curses; Thus, meant for linux terminals

asciiMOL A basic molecule viewer written in Python, using curses; Thus, meant for linux terminals. This is an alpha version, featuring: Opening defaul

Dominik Behrens 328 Dec 11, 2022
A simple python implementation of a reverse shell

llehs A python implementation of a reverse shell. Note for contributors The project is open for contributions and is hacktoberfest registered! llehs u

Archisman Ghosh 2 Jul 05, 2022
CLI tool for one-line installation of C++/CMake projects.

cmakip When working on virtual environments, Python projects can be installed with a single command invocation, for example pip install --no-deps . .

Artificial and Mechanical Intelligence 4 Feb 15, 2022
A CLI/Shell supporting OpenRobot API and more!

A CLI/Shell supporting JeyyAPI, OpenRobot API and RePI API.

OpenRobot Packages 1 Jan 06, 2022
Command-line script to upload videos to Youtube using theYoutube APIv3.

Introduction Command-line script to upload videos to Youtube using theYoutube APIv3. It should work on any platform (GNU/Linux, BSD, OS X, Windows, ..

Arnau Sanchez 1.9k Jan 09, 2023
CLI helper to install Github releases on your system.

gh-release-install is a CLI helper to install Github releases on your system. It can be used for pretty much anything, to install a formatter in your CI, deploy some binary using an orcherstration to

Jonas L. 28 Nov 06, 2022
Voidlx is a terminal cli apps launcher made in python

Voidlx is a terminal cli apps launcher made in python

2 Nov 13, 2021
CLI tool to view your VIT timetable from terminal anytime!

VITime CLI tool to view your timetable from terminal anytime! Table of contents Preview Installation PyPI Source code Updates Setting up Add timetable

16 Oct 04, 2022