I'm trying to use PHP to let users enter URLs of a specific domain and return data about the content of that page, but the domain blocks programs that try to "download pages from this site automatically." Any ideas on how I might circumvent this?
Name:
Anonymous 2013-12-13 16:48
CIRCUMVENT MY ANUS
Name:
Anonymous 2013-12-13 16:56
Easy.
Fake Googlebot's user agent. Webmasters of shitty websites don't block it.
OP, use the cURL binding in your programming language. It lets you set up an HTTP client that behaves almost like a real browser (with cookies etc.), except without JavaScript support.
>>8
And don't forget to switch the bloody user-agent!
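A minimal sketch of what the posts above describe — a cURL handle configured like a browser, with a cookie jar and a non-default user agent. The URL, cookie-file path, and UA string here are placeholders, not anything from the original thread:

```php
<?php
// Sketch of a browser-like cURL client: cookies persisted across requests,
// redirects followed, and a custom user agent. URL and paths are placeholders.
$ch = curl_init("http://example.com/");
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,                // return the body as a string
    CURLOPT_FOLLOWLOCATION => true,                // follow redirects like a browser
    CURLOPT_COOKIEJAR      => "/tmp/cookies.txt",  // write cookies here on close
    CURLOPT_COOKIEFILE     => "/tmp/cookies.txt",  // send them back on later requests
    CURLOPT_USERAGENT      => "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36",
));
$html = curl_exec($ch);
curl_close($ch);
```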
Name:
Anonymous 2013-12-14 7:11
User-Agent: HAXMYANUS
Name:
Anonymous 2013-12-14 8:38
The data I'm trying to use is numerical, and it looks like Googlebot's user agent overlooks it because it's not words. Any other ideas?
I don't know what you did. I was just advising you to write a simple scraper and change the user agent in the requests to "Googlebot". Something like:
curl_setopt($ch, CURLOPT_USERAGENT, "Googlebot/2.1 (+http://www.googlebot.com/bot.html)");
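To make the suggestion concrete: a hedged sketch of such a scraper. The user agent only affects whether the server blocks the request; it has no effect on what is in the page, so numerical data comes back like any other text and you extract it yourself. The URL and the regex below are placeholders, not part of the original advice:

```php
<?php
// Sketch: fetch a page while presenting Googlebot's user agent, then pull
// the numbers out of the returned HTML. URL and regex are placeholders.
$ch = curl_init("http://example.com/stats.html");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return body instead of printing
curl_setopt($ch, CURLOPT_USERAGENT,
    "Googlebot/2.1 (+http://www.googlebot.com/bot.html)");
$html = curl_exec($ch);
curl_close($ch);

if ($html !== false) {
    // Grab every integer or decimal number in the page.
    preg_match_all('/\d+(?:\.\d+)?/', $html, $matches);
    print_r($matches[0]);
}
```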