Mac Thread Ripper

Name: Anonymous 2010-05-16 20:51

I was directed here to get one. I'm looking for something that'll take the images from a thread and download them, à la 4chan downloader.exe.

I use Firefox but will switch to Chrome/Safari if needed for some bizarre reason. If "payment" is required I can start an /s/ thread.

Name: Anonymous 2010-05-16 20:53

RIP MY ANUS

Name: Anonymous 2010-05-16 20:55

I will if you do this for me.

Name: Anonymous 2010-05-16 20:56

THREAD MY ANUS

Name: Anonymous 2010-05-16 20:56

Just learn Ruby, get TextMate, and write it yourself.

Name: Anonymous 2010-05-16 20:57

Shit man, I leave in a day; I don't have time for that.

Name: Anonymous 2010-05-16 21:03

Pretty sure both Firefox and Opera can do this if you're browsing 4chan with them. If for whatever reason you want something standalone, you'll have to write it yourself (there are probably hundreds of these utils written already; some people posted a few on /prog/ not long ago if you're willing to look). There are also true ENTERPRISE imageboard archivers like http://code.google.com/p/fuuka/ if you want to clone boards live as posts are updated, as well as track deleted posts and even let users post their own stuff. I remember writing my own dumper years ago, but I never ended up using it since my browser is really enough.

Name: Anonymous 2010-05-16 21:04

Here's a simple FIOC script I made four score and seven years ago. It rips all images from a thread into a predefined directory (by default it's "/Downloaded"; you can change it to whatever you want), and it can also delete any duplicates.

yes it's shit i know

#!/usr/bin/env python
# Python 2 script: rips all images from a 4chan thread into DOWNLOAD_DIR,
# then optionally scans that directory for duplicate files by md5 hash.

import re
from urllib import urlopen, URLopener
import os
from hashlib import md5
import glob
import sys

DOWNLOAD_DIR = "/Downloaded/"  # change this to whatever you want

thread = None
if len(sys.argv) > 1:
    thread = sys.argv[1]

def dupeChecker(directory=DOWNLOAD_DIR):
    deleteBool = input("Delete known dupes? 0 for no, 1 for yes: ")
    hashArray = []
    inputArray = []
    removed = []
    for image in glob.glob("%s*" % directory):
        # Read each file and record its md5 hash alongside its filename
        try:
            m = md5()
            m.update(open(image, 'rb').read())
            hashArray.append(m.digest())
            inputArray.append(image)
        except IOError:
            continue

    # Compare every pair of hashes exactly once (j > i), looking for matches
    for i, item in enumerate(hashArray):
        if inputArray[i] in removed:
            continue
        for j in range(i + 1, len(hashArray)):
            if item == hashArray[j] and inputArray[j] not in removed:
                print "Dupe found: ", inputArray[i], "==", inputArray[j]
                # If the user chose to, delete the later copy
                if deleteBool == 1:
                    print inputArray[j], "is being deleted..."
                    os.remove(inputArray[j])
                    removed.append(inputArray[j])

def imageDownloader(thread):
    j = 0
    alreadyDownloaded = []
    if thread is not None:
        threadName = str(thread)
    else:
        threadName = raw_input("\nInput a thread URL: ")

    if not os.path.isdir(DOWNLOAD_DIR):
        os.makedirs(DOWNLOAD_DIR)

    # Load the page, then divide it up along quotation marks so every
    # quoted attribute value (including image URLs) becomes its own token
    page = urlopen(threadName).read()
    page = page.split('"')

    # Work out the image directory from the thread URL:
    # split it at each '/', then cut it off at 'res' (which marks the thread number)
    parts = threadName.split('/')
    i = None
    for word in parts:
        if "res" in word:
            i = parts.index(word)
    if i is None:
        print "Couldn't find 'res' in the URL; is that really a thread link?"
        return
    # Join the pieces back up and add 'src' to get the image directory
    imageDir = '/'.join(parts[0:i]) + "/src/"
    if "boards" in imageDir:
        imageDir = imageDir.replace('boards', 'images')

    # Now search the page for anything in the image directory and download it
    for imageName in page:
        # Matches imageDir followed by the image number and an extension:
        # [0-9]* is the timestamp filename, \. is a literal period (escaped
        # so it isn't the regex "any char"), and [a-z]{3,4} covers
        # jpg/png/gif as well as jpeg
        if re.search("^%s[0-9]*\.[a-z]{3,4}$" % imageDir, imageName):
            if imageName not in alreadyDownloaded:
                try:
                    # Save under its original filename in the download folder
                    URLopener().retrieve(imageName, DOWNLOAD_DIR + imageName.split('/')[-1])
                    print imageName, "has been downloaded"
                    alreadyDownloaded.append(imageName)
                    j += 1
                except IOError:
                    pass
    print "%d images downloaded successfully" % j

imageDownloader(thread)

doCheck = input("Would you like to search for dupes? 0 for no, 1 for yes: ")

if doCheck == 1:
    dupeChecker()
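
If you want to sanity-check the sort of pattern used above before pointing it at a real thread, a quick test like this shows what it does and doesn't match (the URLs here are made up, and I've allowed 3-4 letter extensions so .jpeg would match too):

```python
import re

imageDir = "http://images.4chan.org/g/src/"  # hypothetical image directory
pattern = r"^%s[0-9]*\.[a-z]{3,4}$" % imageDir

# Tokens like the ones you'd get after splitting a page on '"'
tokens = [
    "http://images.4chan.org/g/src/1274043086123.jpg",   # full image: matches
    "http://images.4chan.org/g/src/1274043086123s.jpg",  # thumbnail: the 's' breaks [0-9]*\.
    "http://boards.4chan.org/g/res/123456",              # not in the image dir
]
for t in tokens:
    print(t, bool(re.search(pattern, t)))
```

Handy side effect: thumbnails have an 's' before the extension, so the digits-then-dot pattern skips them for free.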

Name: Anonymous 2010-05-16 21:05

someone posted an example of doing this with wget not long ago

Name: Anonymous 2010-05-16 22:49

What happened to DON'T HELP HIM? Let's not encourage this imageboard effluent.

Name: Anonymous 2010-05-16 22:59

If you have not set up your Mac so that double-clicking an exe makes Wine happen, maybe you should use a Wii instead.

Name: Anonymous 2010-05-16 23:23

Your suggestions all suck compared to

http://mrfreeze.github.com/ThreadWatcher/

Name: Anonymous 2010-05-17 0:09

>>12
Your website's buttons are fucked up.

Name: Anonymous 2010-05-17 0:17

>>12
The wget suggestion was better than that.

Name: Anonymous 2010-05-17 0:36

Name: Anonymous 2010-05-17 3:02

>>11
The Wii is nice. Maybe you should get some friends.

Name: StonedOnAJAX 2010-07-23 17:58

I wrote this a while back. It's in PHP, but it can run locally if you have Web Sharing enabled on your Mac and you edit the config to support PHP (it's already installed; just remove a # in httpd.conf).

It basically goes through a page and finds all links to stuff whose URL starts with http://image.4chan.org/

It puts it all in a temporary folder, then bzip2s it up and sends it to the browser as a download (I designed it for a remote server).

Give it a try and see what you think: http://dl.sassybox.net/chan/4Chan%20PHP%20Thread%20Ripper.zip
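
If you'd rather not set up Apache, the same idea is easy to sketch in Python. This only shows the link-scraping and archiving steps, not the HTTP fetching, and it packs a zip rather than bzip2; the prefix and the sample page are assumptions for illustration:

```python
import zipfile

PREFIX = "http://image.4chan.org/"  # assumed image host, per the description above

def find_image_links(html, prefix=PREFIX):
    """Return every quoted link in the page that starts with the prefix."""
    return [t for t in html.split('"') if t.startswith(prefix)]

def zip_files(names_and_data, archive_path):
    """Pack (filename, data) pairs into a zip, ready to send to the browser."""
    with zipfile.ZipFile(archive_path, "w") as z:
        for name, data in names_and_data:
            z.writestr(name, data)

# Tiny made-up page fragment to show what the scraper picks out
sample = ('<a href="http://image.4chan.org/g/src/123.jpg">pic</a> '
          '<a href="http://boards.4chan.org/g/">board</a>')
print(find_image_links(sample))  # -> ['http://image.4chan.org/g/src/123.jpg']
```

Splitting on '"' instead of parsing HTML properly is crude, but it's the same trick the Python script earlier in the thread uses, and it works fine on imageboard markup.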

Name: Anonymous 2010-12-22 5:27

Name: Anonymous 2011-02-03 6:01
