Pretty sure both Firefox and Opera can do this if you're browsing 4chan with them. If for whatever reason you want something standalone, you'll have to write it yourself (there are probably hundreds of these utils written already; some people posted a few on /prog/ not long ago if you're willing to look). There are also true ENTERPRISE imageboard archivers like http://code.google.com/p/fuuka/ if you want to clone boards live as posts are updated, as well as track deleted posts and even let users post their own stuff. I remember writing my own dumper years ago, but I never ended up using it since my browser is really enough.
Name: Anonymous
2010-05-16 21:04
Here's a simple FIOC script I made four score and seven years ago. It rips all the images from a thread into a predefined directory (by default it's "/Downloaded"; you can change it to whatever you want), and it can also delete any duplicates.
yes it's shit i know
#!/usr/bin/env python
import re
from urllib import urlopen, URLopener
import os
from hashlib import md5
import glob
import sys

thread = None
if len(sys.argv) > 1: thread = sys.argv[1]
else: sys.exit("Usage: %s <thread url>" % sys.argv[0])

def dupeChecker(directory="/Downloaded/"):
    deleteBool = input("Delete known dupes? 0 for no, 1 for yes: ")
    hashArray = []
    inputArray = []
    knownDupes = []
    for image in glob.glob("%s*" % directory):
        # Reads the data contained in a file and gathers an md5 hash,
        # then appends the hash and filename to two parallel lists
        try:
            contents = file(image).read()
            m = md5()
            m.update(contents)
            hashArray.append(m.digest())
            inputArray.append(image)
        except IOError:
            continue
    # Cycles through the hash list, searching for matches
    for i, item in enumerate(hashArray):
        for j in range(i + 1, len(hashArray)):
            if item == hashArray[j] and inputArray[j] not in knownDupes:
                print "Dupe found: ", inputArray[i], "==", inputArray[j]
                knownDupes.append(inputArray[j])
                # If the user chose to, it'll delete the dupe
                if deleteBool == 1:
                    print inputArray[j], "is being deleted..."
                    os.remove(inputArray[j])

def imageDownloader(threadName):
    alreadyDownloaded = []
    j = 0
    # Loads the page, then divides it up along quotation marks
    page = urlopen(threadName).read()
    page = page.split('"')
    # Fetches the image directory:
    # start by splitting the URL at each /, then cut it off when it reaches
    # 'res' (which marks the thread number)
    imageDir = threadName.split('/')
    i = len(imageDir)
    for word in imageDir:
        if "res" in word:
            i = imageDir.index(word)
    # Join them and add 'src' to get the image directory
    imageDir = '/'.join(imageDir[0:i]) + "/src/"
    if "boards" in imageDir:
        imageDir = imageDir.replace('boards', 'images')
    # Now search the page for anything in the image directory, then download it
    for imageName in page:
        # Looks for something starting with imageDir and ending with an extension:
        # [0-9]* fills in the image number ([0-9] matches any digit, * repeats it),
        # \. escapes the period (a bare . would match any character),
        # and [a-z]{3} matches a run of three letters after it
        if re.search(r"^%s[0-9]*\.[a-z]{3}$" % imageDir, imageName):
            if imageName not in alreadyDownloaded:
                try:
                    image = URLopener()
                    # Saves the image to the Downloaded folder
                    image.retrieve(imageName, "/Downloaded/" + imageName[-13:])
                    alreadyDownloaded.append(imageName)
                    print imageName, "has been downloaded"
                    j += 1
                except IOError: pass
    print "%d images downloaded successfully" % j

imageDownloader(thread)
doCheck = input("Would you like to search for dupes? 0 for no, 1 for yes: ")
if doCheck == 1:
    dupeChecker()
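Side note on the dupe check: comparing every hash against every other one is O(n^2). A dict keyed on the digest does the same job in one pass. A rough modern-Python sketch of that approach (the function name is mine, not from the script above):

```python
import hashlib
import os

def find_dupes(directory, delete=False):
    """One-pass dupe scan: remember the first file seen for each md5 digest."""
    seen = {}    # digest -> first filename with that content
    dupes = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        if digest in seen:
            print("Dupe found:", seen[digest], "==", path)
            dupes.append(path)
            if delete:
                os.remove(path)
        else:
            seen[digest] = path
    return dupes
```

Same behavior (first copy kept, later copies flagged or deleted), just without re-reading or re-comparing anything.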
I wrote this a while back; it's in PHP, but it can run locally if you have Web Sharing enabled on your Mac and you edit the config to support PHP (it's already installed, just remove a # in httpd.conf).
It basically goes through a page and finds all links to stuff whose URL starts with http://image.4chan.org/.
It puts it all in a temporary folder, then BZIP2's it up and sends it to the browser as a download (I designed it for a remote server).
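I don't have the PHP handy, but the flow it describes (grab every link with the image-host prefix, fetch them into a temp folder, compress, serve the archive) sketches out like this in Python. The regex and function names are mine, not from the PHP, and a .tar.bz2 stands in for whatever bzip2 packaging the original did:

```python
import os
import re
import tarfile

def extract_image_links(html, prefix="http://image.4chan.org/"):
    # Grab every href/src attribute value that starts with the image host
    pattern = r'(?:href|src)="(%s[^"]+)"' % re.escape(prefix)
    return sorted(set(re.findall(pattern, html)))

def bundle(paths, archive_name):
    # Pack the fetched files into a .tar.bz2 for the browser to download
    with tarfile.open(archive_name, "w:bz2") as tar:
        for path in paths:
            tar.add(path, arcname=os.path.basename(path))
```

Download each extracted link into a temp folder (e.g. with urllib), then point bundle() at the results to get the archive you'd send back.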