Python script to download all images/webms of a 4chan thread

Overview

4chan-downloader

Download Script

The main script is called inb4404.py and can be called like this: python inb4404.py [thread/filename]

usage: inb4404.py [-h] [-c] [-d] [-l] [-n] [-r] [-t] thread

positional arguments:
  thread              url of the thread (or filename; one url per line)

optional arguments:
  -h, --help          show this help message and exit
  -c, --with-counter  show a counter next to the image that has been
                      downloaded
  -d, --date          show date as well
  -l, --less          show less information (suppresses checking messages)
  -n, --use-names     use thread names instead of the thread ids
                      (...4chan.org/board/thread/thread-id/thread-name)
  -r, --reload        reload the queue file every 5 minutes
  -t, --title         save original filenames

You can pass a file instead of a thread url. The file can contain as many links as you want, as long as there is one url per line. A line is considered to be a url if it starts with 'http'.
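
The line rule above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `read_thread_urls`, the whitespace stripping, and the thread ids in the sample are assumptions made for the example.

```python
def read_thread_urls(lines):
    """Return only the lines that count as thread urls.

    Hypothetical helper: a line is kept if, after stripping surrounding
    whitespace (an assumption), it starts with 'http'.
    """
    urls = []
    for line in lines:
        line = line.strip()
        if line.startswith('http'):
            urls.append(line)
    return urls


# Made-up queue file content; the thread ids are placeholders.
queue_lines = [
    "https://boards.4chan.org/vg/thread/1234567/Monster-Hunter\n",
    "some note that is not a url\n",
    "-https://boards.4chan.org/vg/thread/7654321\n",
    "\n",
]
thread_urls = read_thread_urls(queue_lines)
print(thread_urls)
```

Under this rule a line that starts with "-" is skipped, because it no longer starts with 'http'.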

If you use the --use-names argument, the thread name is used to name the respective thread directory instead of the thread id.
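
As a rough sketch of how a directory name could be derived from a url of the form ...4chan.org/board/thread/thread-id/thread-name: the function `thread_directory` and the fallback to the thread id when the url carries no name are assumptions for illustration, and the real script may behave differently.

```python
def thread_directory(url, use_names=False):
    """Return (board, folder_name) for a thread url.

    Hypothetical helper. Expects the url layout shown in the help text:
    ...4chan.org/board/thread/thread-id/thread-name (the name is optional).
    """
    parts = url.rstrip('/').split('/')
    idx = parts.index('thread')      # position of the literal 'thread' segment
    board = parts[idx - 1]
    thread_id = parts[idx + 1]
    thread_name = parts[idx + 2] if len(parts) > idx + 2 else None
    # Assumption: fall back to the thread id when no name is present.
    folder = thread_name if (use_names and thread_name) else thread_id
    return board, folder


url = "https://boards.4chan.org/vg/thread/1234567/Monster-Hunter"
print(thread_directory(url))                  # uses the thread id
print(thread_directory(url, use_names=True))  # uses the thread name
```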

Thread Watcher

This script is a work in progress, but the basic functionality is already in place. If you call the script like this:

python thread-watcher.py -b vg -q mhg -f queue.txt -n "Monster Hunter"

then it looks for all threads on the vg board that include mhg, stores each thread url in queue.txt, and appends /Monster-Hunter to the url so that you can use the --use-names argument of the actual download script.
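
The matching and naming step can be sketched as below. This is only an illustration under stated assumptions: the function `queue_entries`, the dictionary keys `no`, `sub` and `com`, the case-insensitive match, and the sample threads are all hypothetical, and fetching the board listing is left out entirely.

```python
def queue_entries(threads, board, query, name):
    """Build queue-file lines for threads whose text contains `query`.

    Hypothetical helper. `threads` is an iterable of dicts with assumed
    keys 'no' (thread id), 'sub' (subject) and 'com' (comment text).
    """
    suffix = '/' + name.replace(' ', '-')   # "Monster Hunter" -> "/Monster-Hunter"
    entries = []
    for thread in threads:
        text = (thread.get('sub', '') + ' ' + thread.get('com', '')).lower()
        if query.lower() in text:           # assumption: case-insensitive match
            entries.append('https://boards.4chan.org/%s/thread/%d%s'
                           % (board, thread['no'], suffix))
    return entries


# Made-up sample data standing in for a board listing.
sample_threads = [
    {'no': 1234567, 'sub': '/mhg/ - Monster Hunter General', 'com': 'hunt'},
    {'no': 7654321, 'sub': 'other general', 'com': 'nothing relevant'},
    {'no': 42, 'com': 'anyone playing MHG tonight?'},
]
entries = queue_entries(sample_threads, 'vg', 'mhg', 'Monster Hunter')
print('\n'.join(entries))
```

Each resulting line could be appended to queue.txt and later read by the download script with --use-names.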

Legacy

The current scripts are written in python3. If you still use python2, you can use an old version of the script inside the legacy directory.

Comments
  • "Something went wrong"

    Regardless of thread, board, script location, drive letter, attribute, or host machine, the script returns "Something went wrong" and times out after several tries. (Screenshot: different threads and boards with different attribute flags, all three failing almost immediately.) Interestingly, however, it works fine on an Arch install. I've tried it on two machines running Windows 10 and reinstalled Python 3 as well. Not entirely sure why it's dropping almost immediately on Windows. I assume this is a Windows problem, but nothing changed since I used it yesterday. Will close if I find a solution for using it on Windows. Sorry there isn't much information to go off of.

    opened by cherioux 16
  • multiple threads from file error

      File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap
        self.run()
    Process Process-1:
      File "Python\Python38-32\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "4chdl\inb4404.py", line 44, in download_thread
        if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)):
    AttributeError: 'NoneType' object has no attribute 'use_names'
    Traceback (most recent call last):
      File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap
        self.run()
      File "Programs\Python\Python38-32\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "4chdl\inb4404.py", line 44, in download_thread
        if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)):
    AttributeError: 'NoneType' object has no attribute 'use_names'

    opened by th3illu 8
  • Filenames...

    I've run into a few issues with filenames. One of them isn't entirely Windows specific, and I have an idea of what needs to be done to fix it, but no idea how to. Duplicate filenames within a thread simply overwrite any preceding files. Usually Spoiler_Image or file.png.

    The other issue is Windows specific and I've managed to solve it at least locally by:

    from django.utils.text import get_valid_filename
    ...snip...
                    img_path = ntpath.join(directory, get_valid_filename(img))
    ...snip...
    

    The issue in question, filenames like this used to make the script halt on Windows:

    C:\vtai>inb4404.py -c -d -l -t https://boards.4channel.org/vt/thread/34806758
    [2022-10-11 07:50:46 PM] [  9/278] vt/34806758/[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png
    Traceback (most recent call last):
      File "C:\vtai\inb4404.py", line 171, in <module>
        main()
      File "C:\vtai\inb4404.py", line 31, in main
        download_thread(thread, args)
      File "C:\vtai\inb4404.py", line 112, in download_thread
        with open(img_path, 'wb') as f:
    OSError: [Errno 22] Invalid argument: 'C:\\vtai\\downloads\\vt\\34806758\\[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png'
    

    After importing it appears to work.

    opened by vt-idiot 5
  • Script can't download files from a thread with another thread quoted (cross-thread)

    NSFW

    How to reproduce:

    1. Try to download from this link: https://boards.4chan.org/gif/thread/13521437

    This error appears:

    File TTT Part 2 Top Tier Titties \1527053877481.webm from https://i.4cdn.org/gif/1536882732639.webm UNKNOWN ERROR OCCURRED [WinError 183] cannot create an existing file: 'C:\\Users\\mario\\4channer\\TTT Part 2 Top Tier Titties '

    But the "TTT Part 2 Top Tier Titties" folder and the two .webm files don't exist. Using a custom folder location doesn't work either.

    opened by ghost 5
  • Corrupt downloads

    Hi! I just downloaded your script, but every file downloaded is corrupt. I picked a random thread, https://boards.4chan.org/b/thread/573855146/pink-ids-will-have-pink-eye-for-the-rest-of-their, but I also tried a few more with the same result.

    Is there any kind of logging I can enable to help you sort this out, if you're not experiencing the same problem?

    Cheers! Luca

    opened by LucaNonato 4
  • auto download/ all images in one folder?

    I want all the images of a certain tag to go into just one folder instead of individual ones. Is there a way to do that? Also, is there a feature to auto-download threads in the queue text file when one appears?

    question 
    opened by azzerzzzeqwe 3
  • Logo for the Downloader Readme

    Hey, I would like to propose a logo for the readme header, and maybe even as an icon for a PyQt GUI version if that ever gets a go. I have altered the original 4chan logo.

    opened by ThisLimn0 2
  • invalid syntax?

    I guess I miss something very simple, but cannot figure it out, so here goes nothing

    I followed the instructions in README.md and typed

    $ python inb4404.py http://boards.4chan.org/w/thread/2096591/lain-thread-anyone-have-any-arisu

    The only result was:

      File "inb4404.py", line 129
        print(line.replace(link, '-' + link), end='')
                                                 ^
    SyntaxError: invalid syntax

    And I'm too dumb to know if this is a typo to fix or I should use a different URL (direct one to the first pic in thread maybe?).

    (After hundreds of shameful edits): also the output points at end='' equation, but for love of god I have no idea how to paste it here so markdown does not cut out all spaces.

    opened by tcheerno 2
  • Rewrite/Convert to python3?

    The 2to3 tool should automate this as far as possible. I think the script should be converted to python3 because this is the new standard; support for python2 will be dropped in the near future.

    opened by KopfKrieg 2
  • Stuck on checking the first post of the thread

    The script keeps "checking" the first post of the thread specified as thread_link.

    I'm using python 2.7.13; the script doesn't work with python 3 because of an error at line 73.

    opened by elleborgo 2
  • Reloading the file doesn't work properly

    Downloading a single thread or multiple threads (using a file) works just fine, but the script is currently unable to reload the file and update the queue properly.

    opened by Exceen 2
  • Configurable threads

    I added a lot between commits, which was a mistake, but the rundown is that I re-worked the threaded downloads.

    Previously, each 4chan thread would get its own process to check and download new images. I created a queue (manager.list) system and made static workers/processes. By default, 4 will be created, but that is configurable with -p. As 'jobs' are pulled from the queue, a worker will work on each one, then wait until another job is available to pull from the queue.

    Let me know if you have any questions or want me to change anything.

    Thanks!

    opened by Zand3r24 4
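
The fixed-worker design described in the "Configurable threads" comment above can be sketched as follows. Note the swap: this sketch uses standard-library threads and `queue.Queue` rather than processes and a `manager.list`, so it illustrates the idea only and is not the contributed code. The names `run_workers` and `handle` are hypothetical; the default of 4 workers comes from the comment.

```python
import queue
import threading


def run_workers(jobs, handle, num_workers=4):
    """Process `jobs` with a fixed number of workers pulling from one queue.

    Hypothetical helper: `handle` is any callable taking one job.
    Returns a list of (job, result) pairs in completion order.
    """
    job_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()      # blocks until a job is available
            if job is None:            # sentinel: no more work for this worker
                break
            outcome = handle(job)
            with results_lock:
                results.append((job, outcome))

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for job in jobs:
        job_queue.put(job)
    for _ in workers:
        job_queue.put(None)            # one sentinel per worker
    for w in workers:
        w.join()
    return results


# Stand-in job handler; a real one would check a thread and download files.
done = run_workers(['a', 'bb', 'ccc'], len, num_workers=2)
print(sorted(done))
```

A fixed pool keeps the number of concurrent downloads bounded no matter how many threads are listed in the queue file.
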
Releases (1.0)
  • 1.0 (Jul 23, 2018)

    • Supports downloading images and webm files of single and multiple (multi-threading) 4chan threads with continuous checks for new posts.
    • Assign names to threads to locate them more easily on your hard drive.
    • When using a file to download multiple threads at once, 404'd threads will be marked automatically with a "-" in front of the URL.
    • Separates downloads into a "download" directory, which serves as an archive, and a "new" directory. Downloaded files are put into both directories. If a file is deleted inside the "download" directory, it will be downloaded again. On the other hand, if a file inside the "new" directory is deleted, it won't be downloaded again. This serves as an easy way to keep a whole thread archived and to track new downloads: deleting a file inside the "new" directory works as a kind of "mark as read" feature.
    • Thread Watcher is more or less an add-on for the actual download script: it checks for threads that include a specified text and adds the respective URLs to a text file, which can be used with the download script.
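
The archive/"new" behaviour from the release notes above can be sketched like this. It is a minimal illustration under assumptions: the function `store_file` and its parameters are hypothetical, and the only rule encoded is the one stated in the notes (the archive directory alone decides whether a file is fetched again).

```python
import os
import shutil


def store_file(data, filename, download_dir, new_dir):
    """Save `data` to the archive and copy it to the "new" directory.

    Hypothetical helper. Returns False (and writes nothing) when the file
    already exists in the archive directory, True otherwise.
    """
    archive_path = os.path.join(download_dir, filename)
    if os.path.exists(archive_path):
        return False                   # already archived: skip
    os.makedirs(download_dir, exist_ok=True)
    os.makedirs(new_dir, exist_ok=True)
    with open(archive_path, 'wb') as f:
        f.write(data)
    # The copy in "new" is never checked, so deleting it has no effect later.
    shutil.copy(archive_path, os.path.join(new_dir, filename))
    return True
```

Deleting the copy in the "new" directory therefore acts as "mark as read", while deleting the archived copy causes the file to be stored again on the next check.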
Owner
Micha Fink
Make YouTube videos tasks in Todoist faster and time efficient!

Youtubist Basically fork of yt-dlp python module to my needs. You can paste playlist or channel link on the YouTube. It will automatically format to s

Konrad Konieczny 1 Dec 04, 2022
Music and video downloader, Made with love by Bryan Herrera

Python-Mp3Mp4-Downloader Music and video downloader, Made with love by Bryan Herrera Requirements CHOCOLATELY windows command If your system does not

ርᚱ1ናተᛰ ᚻህᚥተპᚱ 104 Dec 27, 2022
A Simple YouTube Video Downloader With Python

Simple YouTube Video Downloader Simple YouTube Video Downloader is an open source project with a very simple UI that tries to speed up the process of

Brian Han 2 Jan 03, 2022
Download YouTube videos that are available in the given playlist

Youtube-Playlist-Downloader Download YouTube videos that are available in the given playlist Project assets: music downloaded music folder. (will be g

Sultan Aljaberi 1 Dec 22, 2021
A Python script that allows you to download all of an anime's episodes at once.

BitAnime A Python script that allows you to download all of an anime's episodes at once. · Download executable version · About BitAnime BitAnime is a

sh1nobu 17 Aug 10, 2022
Noto fonts go universal! Download Noto fonts combined to suit your region (South Asia, SE Asia, Africa-MiddleEast, Europe-Americas).

Go Noto Universal Noto fonts go universal! Download Noto fonts combined to suit your region (South Asia, SE Asia, East Asia, Africa-MiddleEast, Europe

Satish B 67 Jan 06, 2023
Fully automated download and parsing for Texas A&M University's Registrar's grade distribution PDFs for years 2014+.

Fully automated download and parsing for Texas A&M University's Registrar's grade distribution PDFs for years 2014+. Adds the parsing results to a mySQL database.

TAMU Grade Distribution 1 Sep 28, 2022
VK sticker downloader with python

VK Sticker Downloader This repository is used to automate download file from VK Sticker How to use Execute the file ./downloader.py Writedown full url

Hartawan Bahari M. 1 Dec 29, 2021
Simple Youtube Video Downloader

Simple Youtube Video Downloader Download Youtube video using link and Will output result in D:/ (You can change the path in main.py file) Installation

Hansen Gianto 1 Oct 28, 2021
mescrappy - Python + Selenium Youtube scraper

mescrappy - Python + Selenium Youtube scraper Youtube Scraping With Python (Selenium) Table of Contents About The Project Built With Getting Started In

Merdan Chariyarov 12 Nov 28, 2021
nextdl - download videos from youtube.com or other video platforms

3 Feb 02, 2022
bing image downloader app used to download bulk images for a specific search term created using streamlit and bing_image_downloader python packages

bing image downloader app bing image downloader app is used to download bulk images for a specific search term. bing image downloader app gets the sea

Siva Prakash 8 Apr 05, 2022
AkShare is an elegant and simple financial data interface library for Python, built for human beings! 开源财经数据接口库

Overview AkShare requires Python(64 bit) 3.7 or greater, aims to make fetch financial data as convenient as possible. Write less, get more! Documentat

Albert King 5.8k Jan 03, 2023
VD Song Bot - A telegram bot that can download songs

VD Song Bot A telegram bot that can download songs Reach me on Telegram @MusicVNDbot Deploy to Heroku The easiest way to deploy this Song Bot Mandator

Venuja Thilakarathna 2 Feb 19, 2022
Python Program that downloads gaming required packages based on your Linux Distribution.

LibreGaming Python Program that downloads gaming required packages based on your Linux Distribution. Table of contents Distributions Prerequisites Dep

Ahmed Al Balochi 195 Jan 01, 2023
Download a large file from Google Drive (curl/wget fails because of the security notice).

gdown Download a large file from Google Drive. Description Download a large file from Google Drive. If you use curl/wget, it fails with a large file b

Kentaro Wada 2.7k Jan 09, 2023
Download minecraft head or skin, allows TLauncher accounts

1 Dec 30, 2021
The free and open-source Download Manager written in pure Python

pyLoad 2.7k Dec 31, 2022
⚙️ A CLI tool that can download songs from youtube.

⚙️ Music Downloader Music Downloader is a tool that can download songs from Youtube. Installation Base requirements: Python 3.7+ If you have Python 3.

matjs 4 Nov 03, 2021
Download from HBO-MAX-BLIM-TV-Paramount

#HBO MAX- BlimTV -Paramount plus 4K Downloader Tool To download 4K HDR DV SDR from HBO MAX- BlimTV -Paramount plus Hello Fellow Developers/ ! Hi! M

4 Dec 25, 2021