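A collection of web-scraping case studies, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, Pinduoduo (pdd), Youku, iQIYI, Ctrip, 12306, 58.com, Sohu, Baidu Index, VIP (Weipu) and Wanfang, Z-Library, OALib, novel sites, bidding sites, procurement sites, and Xiaohongshu.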

lxSpider

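Introduction

  • CSDN
  • Time flies, and I have lost count of how many case studies I have written. The author's articles are published on CSDN, and the code is pushed to GitHub afterwards. Some of the CSDN articles are paid case studies; subscribe as you see fit.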

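Disclaimer

  • This repository is intended for teaching. Nothing it provides may be used for any commercial purpose or in any illegal or non-compliant scenario.

  • The author accepts no liability for any loss or harm of any kind, to the user or to others, arising for any reason from use of the code and strategies provided in this repository.

  • Any dispute arising from or related to this repository should be settled by the parties through friendly negotiation; the author bears no responsibility for any consequences if negotiation fails.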


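Columns

Web scraping basics: for readers who know Python syntax and are getting ready to learn scraping

Web reverse-engineering basics: scraping experience is enough (includes solutions to the Yuanrenxue scraping challenges)

Android reverse-engineering basics: tool introductions, reverse-engineering notes, case sharing

Scraping case collection: paid column, classic cases, continuously updated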


Contents

Blog

Recommended

Contact

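Releases (Kuaishou live-comment collection tool)
  • Kuaishou live-comment (danmaku) collection tool (Jan 30, 2021)

    Usage:

    • 1. Launch run.exe in the dist directory.
    • 2. Enter the streamer's uid, your cookie, and the room id.
    • 3. Click Start and wait; do not click it again.
    • 4. Make sure the streamer is still live.

    Getting the parameters:

    Streamer uid: the last segment of the URL shown in the browser.

    For example, if the URL is: https://live.kuaishou.com/u/yingjia2019

    the streamer's uid is: yingjia2019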
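The uid rule above can also be applied programmatically. The following is a minimal sketch, not part of the released tool; the function name `streamer_uid_from_url` is hypothetical, and it assumes the uid is always the last non-empty path segment of the live-room URL, as in the example.

```python
from urllib.parse import urlparse


def streamer_uid_from_url(url: str) -> str:
    """Return the last non-empty path segment of a live-room URL.

    Hypothetical helper: assumes the uid is the final path segment,
    as in the example URL from the release notes.
    """
    path = urlparse(url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segment found in URL: {url!r}")
    return segments[-1]


if __name__ == "__main__":
    print(streamer_uid_from_url("https://live.kuaishou.com/u/yingjia2019"))  # yingjia2019
```

Using `urlparse` rather than a plain string split means a trailing slash or a query string does not end up in the returned uid.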

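    Your cookie:

    • 1. Open the developer tools: right-click and choose Inspect, or press F12.
    • 2. Click the Network tab.
    • 3. Refresh the page (you can press F5).
    • 4. Find the HTML document whose name matches the streamer's uid, then click Headers on the right.
    • 5. Scroll to the bottom and find the cookie line. Copy the did=web_xxxxxxxxxxxxxx; part.
    • 6. The cookie to enter in the tool is web_xxxxxxxxxxxxxx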
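If you copy the whole cookie line instead of picking out the `did` part by hand, the value can be extracted with a few lines of Python. This is a sketch under assumptions: the helper name `did_from_cookie` and the sample cookie string are made up for illustration, and only the `did=web_...` pattern comes from the steps above.

```python
def did_from_cookie(cookie_header: str) -> str:
    """Return the value of the `did` cookie from a raw Cookie header string.

    Hypothetical helper: the header is assumed to be the usual
    `name=value; name=value` list copied from the browser.
    """
    for part in cookie_header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name == "did":
            return value
    raise KeyError("no 'did' cookie found in the header")


if __name__ == "__main__":
    # Made-up sample header; real values come from your own browser session.
    sample = "clientid=3; did=web_0123456789abcdef; userId=42"
    print(did_from_cookie(sample))  # web_0123456789abcdef
```

The returned string (without the `did=` prefix and the trailing semicolon) is what step 6 asks you to enter.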

    Room id:

    • 1. Click the Elements tab, press Ctrl+F to open the search box, and enter: live-stream-id
    • 2. Copy live-stream-id="Zo9Upaz8w90"
    • 3. The room id to enter is Zo9Upaz8w90
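The same lookup can be done on a saved copy of the page markup. The sketch below assumes you have the HTML as a string (for example, copied from the Elements tab) and that the attribute appears exactly as `live-stream-id="..."`; the function name `room_id_from_html` is hypothetical and not part of the released tool.

```python
import re

# Matches live-stream-id="..." and captures the quoted value.
LIVE_STREAM_ID = re.compile(r'live-stream-id="([^"]+)"')


def room_id_from_html(html: str) -> str:
    """Return the first live-stream-id attribute value found in the markup."""
    match = LIVE_STREAM_ID.search(html)
    if match is None:
        raise ValueError("live-stream-id attribute not found")
    return match.group(1)


if __name__ == "__main__":
    # Made-up fragment that reuses the id from the steps above.
    fragment = '<div class="player" live-stream-id="Zo9Upaz8w90"></div>'
    print(room_id_from_html(fragment))  # Zo9Upaz8w90
```

A regular expression is enough here because only one attribute value is needed; if the streamer is not live, the attribute may be absent and the function raises `ValueError`.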

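    Keep the page open while the tool is running if possible; some time after the page is closed, the cookie expires.

    This tool is intended for learning; do not abuse it.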

    Source code(tar.gz)
    Source code(zip)
    default.rar(21.47 MB)
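  • Novel downloader (Feb 2, 2021)

    Introduction

    1. Novel download (advantage: fast, since complete txt files are fetched directly from the web)
    2. Online novel scraping (advantage: broad coverage; almost any published novel can be found)

    Special notice:

    • This script is for testing, learning, and research only. Commercial use is prohibited. Its legality, accuracy, completeness, and validity are not guaranteed; use your own judgment.

    • No WeChat official account or self-media outlet may repost or publish any resource file from this project in any form.

    • No responsibility is accepted for any script problem in this project, including but not limited to any loss or damage caused by a script error.

    • Do not use any content of this project for commercial or illegal purposes; otherwise you bear the consequences.

    • This project follows the GPL-3.0 License. Where this special notice conflicts with the GPL-3.0 License, this special notice prevails.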

    Source code(tar.gz)
    Source code(zip)
    default.zip(44.16 MB)
Owner
lx
Every noble work is at first impossible.