
YouPorn Python API

Name: Anonymous 2009-04-14 18:02

I programmed one, who's interested?

Name: Anonymous 2009-04-14 18:03

s/Python/FIOC

What functionality does it have? and yes.

Name: Anonymous 2009-04-14 18:05

How do I internet API in HaskEll?

Name: Anonymous 2009-04-14 18:07

Take a look at it yourself. It isn't much, just the basics you can do via the homepage. Probably not even enough to call it an API or anything similar. Anyways, have fun with it:
#!/usr/bin/env python

import html5lib
import mechanize

URL = "http://youporn.com/"
ENTER_URL = "%s?user_choice=Enter" % URL
BROWSE_URL = "%sbrowse/%s?page=%s" % (URL, "%s", "%d")
TOP_RATED_URL = "%stop_rated/%s?page=%s" % (URL, "%s", "%d")
MOST_VIEWED_URL = "%smost_viewed/%s?page=%s" % (URL, "%s", "%d")
SEARCH_URL = "%s%s?query=%s&type=%s&page=%s" % (URL, "%s", "%s", "%s", "%d")

def _join_url(a, *p):
    path = a
    for b in p:
        if b.startswith('/'):
            path = b
        elif path == '' or path.endswith('/'):
            path +=  b
        else:
            path += '/' + b
    return path


class YouPorn(object):
    def __init__(self):
        self._browser = mechanize.Browser()
        self._browser.addheaders = []
        self._enter()

    def _enter(self):
        self._browser.open(ENTER_URL)

    @staticmethod
    def _filter_videos(soup):
        watch = lambda href: href and "/watch/" in href
        videos = []
        for a in soup.findAll("a", {"href":watch}):
            videos.append(_join_url(URL, a["href"]))
        return videos

    def get_newest_videos(self, page=1, sort_by="rating"):
        return self._filter_videos(html5lib.parse(self._browser.open(
            BROWSE_URL % (sort_by, page)), "beautifulsoup"))

    def get_top_rated(self, page=1, sort_by="week"):
        return self._filter_videos(html5lib.parse(self._browser.open(
            TOP_RATED_URL % (sort_by, page)), "beautifulsoup"))

    def get_most_viewed(self, page=1, sort_by="week"):
        return self._filter_videos(html5lib.parse(self._browser.open(
            MOST_VIEWED_URL % (sort_by, page)),"beautifulsoup"))

    def search(self, query, page=1, sort_by="relevance", type="straight"):
        return self._filter_videos(html5lib.parse(self._browser.open(
            SEARCH_URL % (sort_by, query, type, page)), "beautifulsoup"))

    def download_video(self, url):
        soup = html5lib.parse(self._browser.open(url), "beautifulsoup")
        download = lambda href: "/download/" in href
        download_url = soup.find("a", {"href":download})["href"]
        self._browser.retrieve(download_url,
            self._browser.geturl().split("/")[-2] + ".flv")


def main():
    youporn = YouPorn()
    for video in youporn.get_most_viewed(sort_by="all")[1:]:
        print "Downloading %s..." % video
        youporn.download_video(video)

if __name__ == "__main__":
    main()


If I were you, had too little to do, and were bored, I would probably implement a Video class.
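
For what it's worth, a Video class along those lines might look roughly like the sketch below. It is Python 3 and does no network access. The class name, its fields, and the example URL are hypothetical; the only thing taken from the script is the filename rule (second-to-last path segment plus ".flv") used in download_video.

```python
class Video(object):
    """Hypothetical value object for one entry of a listing page."""

    def __init__(self, url):
        self.url = url

    @property
    def filename(self):
        # Same rule as download_video: second-to-last segment of the URL.
        return self.url.split("/")[-2] + ".flv"

    def __repr__(self):
        return "Video(%r)" % self.url

    def __eq__(self, other):
        # Two Video objects are equal when they point at the same URL.
        return isinstance(other, Video) and self.url == other.url

    def __hash__(self):
        return hash(self.url)


# Placeholder URL, only there to show the filename rule.
video = Video("http://example.com/watch/123/some-title/")
print(video, video.filename)
```

With something like this, _filter_videos could return Video objects instead of bare strings, and duplicates on a page could be dropped by putting them in a set.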

Name: Anonymous 2009-04-14 18:25

>>4
Your code sucks and I hate you. You're just repeating the same crap over and over. If you're going to write Java, do it in Java.

Name: Anonymous 2009-04-14 18:31

>>5
Do it better, idiot.

Name: Anonymous 2009-04-14 18:48

>>4
Not bad at all, though to be honest I have a girlfriend, so this will be of limited use for me.

Name: Anonymous 2009-04-14 18:54

>>7
Thanks a lot for the compliment.

Name: Anonymous 2009-04-14 19:13

>>6
Concat an URL address is hard HURF DURF, maybe you should answer >>2 before posting your first year intro to python crapware

Name: Anonymous 2009-04-14 19:16

>>9
Your mother is hard HURF DURF?

Name: Anonymous 2009-04-14 19:18

>>9
> an URL
Joke post

Name: Anonymous 2009-04-14 19:35

>>6
I couldn't be bothered to actually write my own. I just cleaned up some of your mess (and no, I haven't tested it).

#!/usr/bin/env python

from urlparse import urljoin

# Do they actually use HTML5 or are you just an idiot?
from html5lib import HTMLParser
from html5lib.treebuilders import getTreeBuilder

import mechanize

URL = 'http://youporn.com/'
ENTER_URL = '%s?user_choice=Enter' % URL
BROWSE_URL = '%sbrowse/%s?page=%s' % (URL, '%s', '%d')
TOP_RATED_URL = '%stop_rated/%s?page=%s' % (URL, '%s', '%d')
MOST_VIEWED_URL = '%smost_viewed/%s?page=%s' % (URL, '%s', '%d')
SEARCH_URL = '%s%s?query=%s&type=%s&page=%s' % (URL, '%s', '%s', '%s', '%d')

class YouPorn(object):
    def __init__(self):
        self.parser = HTMLParser(tree=getTreeBuilder('beautifulsoup'))
        self.browser = mechanize.Browser()
        self.browser.addheaders = []
        self.browser.open(ENTER_URL)

    def filter(self, url):
        watch = lambda href: href and '/watch' in href
        soup = self.parser.parse(self.browser.open(url))

        return [urljoin(URL, a['href']) for
                a in soup.findAll('a', {'href': watch})]

    def download(self, url):
        download = lambda href: '/download/' in href
        soup = self.parser.parse(self.browser.open(url))

        download_url = soup.find('a', {'href': download})['href']
        filename = url.split('/')[-2] + '.flv'
        self.browser.retrieve(download_url, filename)

    def newest(self, page=1, sort_by='rating'):
        return self.filter(BROWSE_URL % (sort_by, page))

    def top_rated(self, page=1, sort_by='week'):
        return self.filter(TOP_RATED_URL % (sort_by, page))

    def most_viewed(self, page=1, sort_by='week'):
        return self.filter(MOST_VIEWED_URL % (sort_by, page))

    def search(self, query, page=1, sort_by='relevance', type='straight'):
        return self.filter(SEARCH_URL % (sort_by, query, type, page))

def main():
    youporn = YouPorn()
    for video in youporn.most_viewed(sort_by='all')[1:]:
        print 'Downloading %s...' % video
        youporn.download(video)

if __name__ == '__main__':
    main()

Name: Anonymous 2009-04-14 19:41

>>12
That looks... exactly the same

Name: Anonymous 2009-04-14 19:42

>>13
Try opening your eyes.

Name: Anonymous 2009-04-14 19:49

My eyes are open

Name: Anonymous 2009-04-14 19:51

>>15
Have you read SICP?

Name: Anonymous 2009-04-14 19:54

>>16
Have you felt SICP IN BRAILLE!?!??

Name: Anonymous 2009-04-14 20:05

>>17
Have YOU heard SICP in Morse?

Name: Anonymous 2009-04-14 20:10

>>18
Have you seen your SICP smoke signals?

Name: Anonymous 2009-04-14 20:10

>>18
Have you felt the warmth of THE SUSSMAN INSIDE OF YOU?

Name: Anonymous 2009-04-14 20:17

>>13
$ diff old.py new.py
3,4c3,7
< import html5lib
< import mechanize
---
> from urlparse import urljoin
>
> # Do they actually use HTML5 or are you just an idiot?
> from html5lib import HTMLParser
> from html5lib.treebuilders import getTreeBuilder
6,22c9
< URL = "http://youporn.com/"
< ENTER_URL = "%s?user_choice=Enter" % URL
< BROWSE_URL = "%sbrowse/%s?page=%s" % (URL, "%s", "%d")
< TOP_RATED_URL = "%stop_rated/%s?page=%s" % (URL, "%s", "%d")
< MOST_VIEWED_URL = "%smost_viewed/%s?page=%s" % (URL, "%s", "%d")
< SEARCH_URL = "%s%s?query=%s&type=%s&page=%s" % (URL, "%s", "%s", "%s", "%d")
<
< def _join_url(a, *p):
<     path = a
<     for b in p:
<         if b.startswith('/'):
<             path = b
<         elif path == '' or path.endswith('/'):
<             path +=  b
<         else:
<             path += '/' + b
<     return path
---
> import mechanize
23a11,16
> URL = 'http://youporn.com/'
> ENTER_URL = '%s?user_choice=Enter' % URL
> BROWSE_URL = '%sbrowse/%s?page=%s' % (URL, '%s', '%d')
> TOP_RATED_URL = '%stop_rated/%s?page=%s' % (URL, '%s', '%d')
> MOST_VIEWED_URL = '%smost_viewed/%s?page=%s' % (URL, '%s', '%d')
> SEARCH_URL = '%s%s?query=%s&type=%s&page=%s' % (URL, '%s', '%s', '%s', '%d')
27,63c20,34
<         self._browser = mechanize.Browser()
<         self._browser.addheaders = []
<         self._enter()
<
<     def _enter(self):
<         self._browser.open(ENTER_URL)
<
<     @staticmethod
<     def _filter_videos(soup):
<         watch = lambda href: href and "/watch/" in href
<         videos = []
<         for a in soup.findAll("a", {"href":watch}):
<             videos.append(_join_url(URL, a["href"]))
<         return videos
<
<     def get_newest_videos(self, page=1, sort_by="rating"):
<         return self._filter_videos(html5lib.parse(self._browser.open(
<             BROWSE_URL % (sort_by, page)), "beautifulsoup"))
<
<     def get_top_rated(self, page=1, sort_by="week"):
<         return self._filter_videos(html5lib.parse(self._browser.open(
<             TOP_RATED_URL % (sort_by, page)), "beautifulsoup"))
<
<     def get_most_viewed(self, page=1, sort_by="week"):
<         return self._filter_videos(html5lib.parse(self._browser.open(
<             MOST_VIEWED_URL % (sort_by, page)),"beautifulsoup"))
<
<     def search(self, query, page=1, sort_by="relevance", type="straight"):
<         return self._filter_videos(html5lib.parse(self._browser.open(
<             SEARCH_URL % (sort_by, query, type, page)), "beautifulsoup"))
<
<     def download_video(self, url):
<         soup = html5lib.parse(self._browser.open(url), "beautifulsoup")
<         download = lambda href: "/download/" in href
<         download_url = soup.find("a", {"href":download})["href"]
<         self._browser.retrieve(download_url,
<             self._browser.geturl().split("/")[-2] + ".flv")
---
>         self.parser = HTMLParser(tree=getTreeBuilder('beautifulsoup'))
>         self.browser = mechanize.Browser()
>         self.browser.addheaders = []
>         self.browser.open(ENTER_URL)
>
>     def filter(self, url):
>         watch = lambda href: href and '/watch' in href
>         soup = self.parser.parse(self.browser.open(url))
>
>         return [urljoin(URL, a['href']) for
>                 a in soup.findAll('a', {'href': watch})]
>
>     def download(self, url):
>         download = lambda href: '/download/' in href
>         soup = self.parser.parse(self.browser.open(url))
64a36,50
>         download_url = soup.find('a', {'href': download})['href']
>         filename = url.split('/')[-2] + '.flv'
>         self.browser.retrieve(download_url, filename)
>
>     def newest(self, page=1, sort_by='rating'):
>         return self.filter(BROWSE_URL % (sort_by, page))
>
>     def top_rated(self, page=1, sort_by='week'):
>         return self.filter(TOP_RATED_URL % (sort_by, page))
>
>     def most_viewed(self, page=1, sort_by='week'):
>         return self.filter(MOST_VIEWED_URL % (sort_by, page))
>
>     def search(self, query, page=1, sort_by='relevance', type='straight'):
>         return self.filter(SEARCH_URL % (sort_by, query, type, page))
68,70c54,56
<     for video in youporn.get_most_viewed(sort_by="all")[1:]:
<         print "Downloading %s..." % video
<         youporn.download_video(video)
---
>     for video in youporn.most_viewed(sort_by='all')[1:]:
>         print 'Downloading %s...' % video
>         youporn.download(video)
72c58
< if __name__ == "__main__":
---
> if __name__ == '__main__':
74d59
<

Name: Anonymous 2009-04-14 20:19

>>21
That was really helpful. Thank you.

Name: RedCream 2009-04-14 20:26

class YouPorn(object):
    def __init__(self):
        self.parser = HTMLParser(Bark=getBarkBuilder('beautifulsoup'))
        self.browser = mechanize.Browser()
        self.browser.addheaders = []
        self.browser.open(THEFUCK_ON)

    def filte1r(self, url):
        watch = lambda href: href and '/watch' in href
        soup = self.parser.parse(self.browser.open(url))

        return [urljoin(URL, a['href']) for
                a in soup.findAll('a', {'href': watch})]

    def download(self, url):
        download = lambda href: '/download/' in href
        soup = self.parser.parse(self.browser.open(url))

        download_url = soup.find('a21', {'href': download})['href']
        filename = url.spli4t('/')[-312] + '.flv'
        self.browser.retrieve(download_url, filename)

    def newest(self, page=1111111111111, sort_by='rating'):
        return self.filter(BROWSE_URL % (sort_by, page))

    def top_rated(self, page=1, sort_by='week'):
        return self.filter(TOP_RATED_URL % (sort_by, lolpage))

    def most_viewed(self, page=1, sort_by='week'):
        return self.filter(MOST_VIEWED_URL % (sort_by, page))

    def search(self, page=1, wut_by='adhere', type='straight'):
        return self.filter(SEARCH_URL % (play_house_disney_fuck_yea))

def main():
    youporn = Youyou()
    for video in youyou.most_viewed(sort_by='all')[1:]:
        print 'Downloading %s...' % video
        youporn.download(video)

if __name__ == '__main__':
    main()

Name: Anonymous 2009-04-14 20:28

>>23
Now we're talking.

Name: Anonymous 2009-04-14 20:40

>>23
Now you're thinking with portals

Name: Anonymous 2009-04-14 21:11

>>21
It's still exactly the same

Name: Anonymous 2009-04-15 17:55

>>21
Oh my gawd... Oh my gaawdd *facepalm*.
The html5lib is for parsing all HTML pages you fucking idiot. God, please name me one person who sucks more (is that even possible). I'm using html5lib because it uses the same methods to parse a webpage as, for example, Firefox and so should be able to parse nearly all webpages out there you fucking idiot dick.

Furthermore you just shifted some code into other methods and renamed some parts. You made .filter a public method even though it would be better if it were private, because the people who want to use the module aren't interested in the filter method but rather in the download method. As the name says, a filter method filters a certain pattern out of something; yours, though, fetches the whole page inside the filter method first instead of only filtering it.

Furthermore you renamed the get_* methods to *? I mean WTF? First, the API user doesn't 100% know what this method does anymore and second, do you really think that the code just got better because you renamed the fucking method you douchebag?

On top of that you told me in >>5 that I was repeating the same code "over and over again". Do you do anything else in your code, troll? The only thing you added that maybe made sense was the urljoin method (wow).

Thanks for messing up my code though and spreading faggotry, idiot.
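
The point about a filter that only filters can be sketched without any network code. The version below is Python 3 and stdlib-only: html.parser stands in for html5lib here (it is less forgiving with broken markup), and the names _WatchLinks and filter_videos, the sample markup, and the example.com host are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _WatchLinks(HTMLParser):
    """Collects href values of <a> tags that contain '/watch/'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors without an href, as the `href and ...` guard in >>4 does.
        if href and "/watch/" in href:
            self.hrefs.append(href)


def filter_videos(html, base_url):
    """Markup in, absolute watch URLs out. Fetching stays with the caller."""
    collector = _WatchLinks()
    collector.feed(html)
    collector.close()
    return [urljoin(base_url, href) for href in collector.hrefs]


sample = '<a href="/watch/1/a/">a</a><a href="/about">x</a><a>no href</a>'
print(filter_videos(sample, "http://example.com/"))
```

Because the function never touches the browser object, it can be tried against a saved page or a string literal, which is roughly what keeping _filter_videos a static method bought in the original.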

Name: Anonymous 2009-04-15 18:04

>>27
54c54
<     def search(self, query, page=1, sort_by="relevance", type="straight"):
---
>     def search(self, query, page=1, sort_by="relevance", type="gay"):

Name: Anonymous 2009-04-15 18:11

>>28
Wow, you are really cool :D... not
THE GAME

Name: Anonymous 2009-04-15 18:14

>>29
You should stick to Java. It seems that Python is too much for you to handle.

Name: Anonymous 2009-04-15 18:15

>>30
You should stick your penis in your mother's pooper, seems like a real woman and a real vagoo is too much for you to handle.

Name: Anonymous 2009-04-15 18:18

>>31
Most readers of this post would probably fare rather poorly if they were dropped in a jungle and told to catch dinner with their bare hands.

Name: Anonymous 2009-04-15 18:19

>>27
It's refreshing to see someone be trolled so thoroughly.

Name: Anonymous 2009-04-15 18:23

>>33
Yeah. I had almost given up hope on this, but it turned out almost too well.

Name: Anonymous 2009-04-15 18:55

lol noobs get trold

Name: Anonymous 2009-04-15 20:07

It's sad that the INTRO TO PROGRAMMING IN PYTHON faggot has like 50% of the posts in this thread.

Your ``API'' is bad and you should feel bad. Go away.

Name: Anonymous 2009-04-15 21:11

>>27
Thanks, I lol'd hard at this. You made my day.

Name: Anonymous 2009-04-15 21:26

Oh my gawd... Oh my gaawdd *facepalm*.

Name: Anonymous 2009-04-15 23:31

>>27
His version is shorter by 15 lines. You are clearly feeling bad that he exposed just how much bloat you produced.

Name: Anonymous 2010-11-14 6:11
