Python script to download all images/webms of a 4chan thread

Overview

4chan-downloader

Download Script

The main script is called inb4404.py and can be called like this: python inb4404.py [thread/filename]

usage: inb4404.py [-h] [-c] [-d] [-l] [-n] [-r] [-t] thread

positional arguments:
  thread              url of the thread (or filename; one url per line)

optional arguments:
  -h, --help          show this help message and exit
  -c, --with-counter  show a counter next to the image that has been
                      downloaded
  -d, --date          show date as well
  -l, --less          show less information (suppresses checking messages)
  -n, --use-names     use thread names instead of the thread ids
                      (...4chan.org/board/thread/thread-id/thread-name)
  -r, --reload        reload the queue file every 5 minutes
  -t, --title         save original filenames

You can pass a file instead of a thread URL. The file can contain as many links as you want, as long as there is one URL per line. A line is treated as a URL if its first four characters are 'http'.
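The line rule above can be sketched as follows. This is a minimal illustration, not the script's actual code: `read_queue` is a hypothetical name, and stripping surrounding whitespace before the check is an assumption.

```python
def read_queue(lines):
    """Return the thread URLs found in queue-file lines.

    A line counts as a URL only if its first four characters are 'http'.
    Stripping whitespace first is an assumption made for this sketch.
    """
    urls = []
    for line in lines:
        line = line.strip()
        if line[:4] == "http":
            urls.append(line)
    return urls


if __name__ == "__main__":
    sample = [
        "https://boards.4chan.org/vg/thread/123456\n",
        "-https://boards.4chan.org/vg/thread/111111\n",  # prefixed, so skipped
        "\n",
    ]
    print(read_queue(sample))
```

One consequence of the rule is that a URL prefixed with "-" no longer starts with 'http', so it is skipped on the next read.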

If you use the --use-names argument, the thread name is used to name the respective thread directory instead of the thread id.
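As a rough sketch of how a directory name could be derived from a thread URL of the form `...4chan.org/board/thread/thread-id/thread-name`: the function name `thread_directory` and the fallback to the thread id when the URL carries no name are assumptions for illustration, not the script's confirmed behavior.

```python
from urllib.parse import urlparse


def thread_directory(url, use_names=False):
    """Return (board, folder) for a thread URL.

    Expected path layout: /board/thread/thread-id[/thread-name].
    Falling back to the id when no name is present is an assumption.
    """
    parts = urlparse(url).path.strip("/").split("/")
    board, thread_id = parts[0], parts[2]
    thread_name = parts[3] if len(parts) > 3 else None
    folder = thread_name if (use_names and thread_name) else thread_id
    return board, folder


if __name__ == "__main__":
    url = "https://boards.4chan.org/w/thread/2096591/lain-thread-anyone-have-any-arisu"
    print(thread_directory(url))                  # folder is the thread id
    print(thread_directory(url, use_names=True))  # folder is the thread name
```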

Thread Watcher

This script is a work in progress, but the basic functionality already works. If you call the script like this:

python thread-watcher.py -b vg -q mhg -f queue.txt -n "Monster Hunter"

then it looks for all threads on the vg board that include mhg, stores each thread URL in queue.txt, and appends /Monster-Hunter to the URL so that you can use the --use-names argument of the download script.
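The queue entries described above might be built along these lines. This is a sketch only: `matches` and `queue_entry` are hypothetical helpers, the case-insensitive comparison is an assumption, and the space-to-hyphen conversion is inferred from the "Monster Hunter" to /Monster-Hunter example.

```python
def matches(query, text):
    """Substring test; ignoring case is an assumption of this sketch."""
    return query.lower() in text.lower()


def queue_entry(board, thread_id, name=None):
    """Build a thread URL, optionally ending in a hyphenated thread name."""
    url = f"https://boards.4chan.org/{board}/thread/{thread_id}"
    if name:
        # "Monster Hunter" becomes "Monster-Hunter"
        url += "/" + "-".join(name.split())
    return url


if __name__ == "__main__":
    if matches("mhg", "/mhg/ - Monster Hunter General"):
        print(queue_entry("vg", 123456, "Monster Hunter"))
```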

Legacy

The current scripts are written in Python 3. If you still use Python 2, you can use an old version of the script from the legacy directory.

Comments
  • "Something went wrong"

    Regardless of thread, board, script location, drive letter, attribute flags, or host machine, the script returns "Something went wrong" and times out after several tries. I tried different threads and boards with different flags, and all three attempts failed almost immediately.

    Interestingly, it works fine on an Arch install. I've tried it on two machines running Windows 10 and reinstalled Python 3 as well. I'm not entirely sure why it drops almost immediately on Windows. I assume this is a Windows problem, but nothing has changed since I used it yesterday. I will close this if I find a solution for Windows. Sorry there isn't much information to go on.

    opened by cherioux 16
  • multiple threads from file error

      File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap
        self.run()
    Process Process-1:
      File "Python\Python38-32\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "4chdl\inb4404.py", line 44, in download_thread
        if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)):
    AttributeError: 'NoneType' object has no attribute 'use_names'
    Traceback (most recent call last):
      File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap
        self.run()
      File "Programs\Python\Python38-32\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "4chdl\inb4404.py", line 44, in download_thread
        if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)):
    AttributeError: 'NoneType' object has no attribute 'use_names'

    opened by th3illu 8
  • Filenames...

    I've run into a few issues with filenames. One of them isn't entirely Windows specific, and I have an idea of what needs to be done to fix it, but no idea how to. Duplicate filenames within a thread simply overwrite any preceding files. Usually Spoiler_Image or file.png.

    The other issue is Windows specific and I've managed to solve it at least locally by:

    from django.utils.text import get_valid_filename
    ...snip...
                    img_path = ntpath.join(directory, get_valid_filename(img))
    ...snip...
    

    The issue in question, filenames like this used to make the script halt on Windows:

    C:\vtai>inb4404.py -c -d -l -t https://boards.4channel.org/vt/thread/34806758
    [2022-10-11 07:50:46 PM] [  9/278] vt/34806758/[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png
    Traceback (most recent call last):
      File "C:\vtai\inb4404.py", line 171, in <module>
        main()
      File "C:\vtai\inb4404.py", line 31, in main
        download_thread(thread, args)
      File "C:\vtai\inb4404.py", line 112, in download_thread
        with open(img_path, 'wb') as f:
    OSError: [Errno 22] Invalid argument: 'C:\\vtai\\downloads\\vt\\34806758\\[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png'
    

    After importing it appears to work.

    opened by vt-idiot 5
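Both filename problems raised in this comment could also be approached with the standard library alone. The following is a sketch under stated assumptions, not the script's code and not a drop-in replacement for Django's `get_valid_filename`: `safe_filename` and `unique_path` are hypothetical helpers, the 200-character cap is an arbitrary choice, and reserved Windows device names such as CON are not handled.

```python
import os
import re

# Characters Windows does not allow in file names, plus control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(name, max_length=200):
    """Replace forbidden characters and cap the length, keeping the extension."""
    stem, ext = os.path.splitext(name)
    stem = _FORBIDDEN.sub("_", stem).rstrip(" .")  # no trailing space or dot
    ext = _FORBIDDEN.sub("_", ext)
    stem = stem[: max(1, max_length - len(ext))]
    return (stem or "file") + ext


def unique_path(directory, name):
    """Return a path in directory that does not overwrite an existing file."""
    stem, ext = os.path.splitext(name)
    candidate = os.path.join(directory, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f"{stem}_{counter}{ext}")
        counter += 1
    return candidate
```

With helpers like these, a second `file.png` in the same thread would be written as `file_1.png` instead of overwriting the first.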
  • Script can't download files from a thread with another thread quoted (Cross-thread)

    NSFW

    How to reproduce:

    1. Try to download from this link: https://boards.4chan.org/gif/thread/13521437

    This error appears

    File TTT Part 2 Top Tier Titties \1527053877481.webm from https://i.4cdn.org/gif/1536882732639.webm UNKNOWN ERROR OCCURRED [WinError 183] cannot create an existing file: 'C:\\Users\\mario\\4channer\\TTT Part 2 Top Tier Titties '

    But the "TTT Part 2 Top Tier Titties" folder and the two .webm files don't exist. Using a custom folder location doesn't work either.

    opened by ghost 5
  • Corrupt downloads

    Hi! I just downloaded your script, but every downloaded file is corrupt. I picked a random thread, https://boards.4chan.org/b/thread/573855146/pink-ids-will-have-pink-eye-for-the-rest-of-their, and also tried a few more with the same result.

    Is there any kind of logging I can enable to help you sort this out, if you're not experiencing the same problem?

    Cheers! Luca

    opened by LucaNonato 4
  • auto download/ all images in one folder?

    I want all the images of a certain tag to go into just one folder instead of individual folders. Is there a way to do that? Also, is there a feature to automatically download threads in the queue text file when one appears?

    question 
    opened by azzerzzzeqwe 3
  • Logo for the Downloader Readme

    Hey, I would like to propose a logo for the README header, and maybe even as an icon for a PyQt GUI version if that ever gets a go. I have altered the original 4chan logo.

    opened by ThisLimn0 2
  • invalid syntax?

    I guess I miss something very simple, but cannot figure it out, so here goes nothing

    I followed the instructions in README.md and typed

    $ python inb4404.py http://boards.4chan.org/w/thread/2096591/lain-thread-anyone-have-any-arisu

    The only result was:

      File "inb4404.py", line 129
        print(line.replace(link, '-' + link), end='')
                                                 ^
    SyntaxError: invalid syntax

    And I'm too dumb to know if this is a typo to fix or I should use a different URL (direct one to the first pic in thread maybe?).

    (After hundreds of shameful edits): also the output points at end='' equation, but for love of god I have no idea how to paste it here so markdown does not cut out all spaces.

    opened by tcheerno 2
  • Rewrite/Convert to python3?

    The 2to3 tool should automate this as far as possible. I think the script should be converted to Python 3 because it is the new standard, and support for Python 2 will be dropped in the near future.

    opened by KopfKrieg 2
  • Stuck on checking the first post of the thread

    The script keeps "checking" the first post of the thread specified as thread_link

    I'm using python 2.7.13, script doesn't work with python 3 because of an error at line 73.

    opened by elleborgo 2
  • Reloading the file doesn't work properly

    Downloading a single thread and downloading multiple threads (using a file) both work just fine, but the script is currently unable to reload the file and update the queue properly.

    opened by Exceen 2
  • Configurable threads

    I added a lot between commits, which was a mistake, but the rundown is that I re-worked the threaded downloads.

    Previously each 4chan thread would get its own process to check and download new images. I created a queue (manager.list) system and made static workers/processes. By default, 4 will be created, but that is configurable with -p. As 'jobs' are pulled from the queue, a worker thread will work on it, then wait until another job is available to pull from the queue.

    Let me know if you have any questions or want me to change anything.

    Thanks!

    opened by Zand3r24 4
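The fixed-pool design described in the last comment (a shared job queue served by a configurable number of workers, 4 by default) can be illustrated roughly as below. This sketch swaps in threads and `queue.Queue` for the processes and `manager.list` mentioned in the comment, purely to keep the example short. `worker`, `run_pool`, and `handle` are hypothetical names.

```python
import queue
import threading


def worker(jobs, results, handle):
    """Pull jobs from the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(handle(job))


def run_pool(thread_links, handle, num_workers=4):
    """Process every link with a fixed number of workers; return the results."""
    jobs = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=worker, args=(jobs, results, handle))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for link in thread_links:
        jobs.put(link)
    for _ in workers:
        jobs.put(None)  # one stop signal per worker
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    links = ["vg/1", "vg/2", "w/3"]
    print(sorted(run_pool(links, str.upper, num_workers=2)))
```

Results arrive in completion order, so callers that need a stable order should sort them or tag each job.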
Releases(1.0)
  • 1.0(Jul 23, 2018)

    • Supports downloading images and webm-files of single and multiple (multi-threading) 4chan-threads with continuous checks for new posts.
    • Assign names to threads to locate them easier on your hard drive
    • When using a file to download multiple threads at once, 404'd threads will be marked automatically with a "-" in front of the URL.
    • Separates downloads into a "download" directory, which serves as an archive, and a "new" directory. Downloaded files are put into both directories. If a file is deleted from the "download" directory, it will be downloaded again. If a file is deleted from the "new" directory, however, it won't be downloaded again. This is an easy way to keep a whole thread archived while tracking new downloads: deleting a file from the "new" directory works as a kind of "mark as read" feature.
    • Thread Watcher is more or less an add-on for the download script. It checks for threads that include a specified text and adds their URLs to a text file that can be used with the download script.
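The "download"/"new" behavior from the release notes can be sketched as follows. This is an illustration under assumptions rather than the script's implementation: `store` is a hypothetical helper, the directory names follow the wording of the release notes, and only the archive directory is consulted when deciding whether to fetch a file.

```python
import os
import shutil


def store(root, filename, data):
    """Write data to root/download and root/new unless the archive has it.

    Returns True if the file was written, False if it was already archived.
    """
    archive = os.path.join(root, "download")
    fresh = os.path.join(root, "new")
    os.makedirs(archive, exist_ok=True)
    os.makedirs(fresh, exist_ok=True)

    target = os.path.join(archive, filename)
    if os.path.exists(target):
        # Already archived: a copy deleted from "new" is not restored.
        return False
    with open(target, "wb") as f:
        f.write(data)
    shutil.copy(target, os.path.join(fresh, filename))
    return True
```

Because only the archive copy is checked, deleting a file from "new" is permanent, while deleting it from "download" triggers a fresh download on the next pass.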
Owner
Micha Fink