A Pixiv web crawler module

Last update: Nov 14, 2021

Related tags

Web Crawling Pixiv-spider

Overview

Pixiv-spider

A Pixiv spider module

WARNING

This is unfinished work; read the code carefully before using it.

Features

0004 - Readme.md updated, comments fixed, variable names fixed.

0003 - Name changed to "Pixiv-spider", bugs fixed, ugoira support added.

Installation

Clone or download this repository, then change into its directory and run the following in your terminal:

python ./setup.py install

Usage

Classes

LastestPicGetter - Picker that fetches the latest artworks using GET requests
Artwork - Class for requesting and parsing artwork data

Example

# Import LastestPicGetter from the installed package before running this example.

def main():
    # Use your own cookie if you want to log in.
    COOKIE = ""
    try:
        with open("./COOKIE.key") as cookie_file:
            # Strip the trailing newline from the cookie string.
            COOKIE = cookie_file.readline().strip()
    except FileNotFoundError:
        print("COOKIE.key not found")

    # User-Agent
    UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"

    # Proxies, if needed
    PROXIES = {
        "http": "socks5://127.0.0.1:10808",
        "https": "socks5://127.0.0.1:10808",
    }

    # Keyword for searching
    keyword = "艦これ"

    # "r18" or "safe"; logging in is necessary for R-18 mode
    mode = "safe"

    # Create a picker that uses GET requests
    picker = LastestPicGetter(keyword, mode=mode, cookie=COOKIE, UA=UA, proxies=PROXIES)

    # Fetch result page 5 only; widen the range to fetch more pages
    for i in range(5, 6):
        picker.request(i)
        picker.parsing()

        print("Result:", list(picker.result.keys()))
        print("Last page:", picker.last_page)

        # picker.request_all()
        picker.download_path_all(".\\pics\\")


if __name__ == "__main__":
    main()
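
The example above reads a cookie file and then runs request, parse, and download for a single page. As a rough sketch, those two steps could be wrapped in small helpers. `load_cookie` and `crawl_pages` are hypothetical names and are not part of Pixiv-spider. `FakePicker` is a stand-in that lets the sketch run without network access. The only picker members used are the ones shown in the example (`request`, `parsing`, `result`, `download_path_all`). The sketch also assumes that `download_path_all` expects a directory path ending in a separator, as `".\\pics\\"` suggests.

```python
import os
from pathlib import Path


def load_cookie(path="COOKIE.key"):
    """Return the first line of the cookie file, or "" if the file is missing."""
    try:
        with open(path, encoding="utf-8") as cookie_file:
            return cookie_file.readline().strip()
    except FileNotFoundError:
        return ""


def crawl_pages(picker, first_page, last_page, out_dir):
    """Request, parse, and download every page in [first_page, last_page].

    `picker` is any object exposing request(), parsing(), result, and
    download_path_all(), as in the README example.
    Returns a dict mapping each page number to its list of result keys.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    seen = {}
    for page in range(first_page, last_page + 1):
        picker.request(page)
        picker.parsing()
        seen[page] = list(picker.result.keys())
        # Assumption: the method wants a trailing separator, as in ".\\pics\\".
        picker.download_path_all(str(out_dir) + os.sep)
    return seen


class FakePicker:
    """Hypothetical stand-in for LastestPicGetter; performs no network I/O."""

    def __init__(self):
        self.result = {}
        self.downloads = []
        self._page = None

    def request(self, page):
        self._page = page

    def parsing(self):
        self.result = {f"artwork-{self._page}": None}

    def download_path_all(self, path):
        self.downloads.append(path)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        pages = crawl_pages(FakePicker(), 1, 3, Path(tmp) / "pics")
        print(pages)
```

With a real picker, pass the object created in the example in place of `FakePicker()`. Building the output directory with `pathlib` avoids hard-coding Windows path separators.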

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Owner

Uzuki