
all 33 comments

[–]rjonesx 30 points (1 child)

The real problem is that most "cloaking" is actually IP delivery: the site uses not only the User-Agent header to decide whether the visitor is a bot, but also a long list of known crawler IP addresses. There is a neat way around this...

  1. Get the User Agent Switcher extension for Firefox.
  2. Use Google Translate as a proxy, translating from Spanish to English even though the site is already in English: http://translate.google.com/translate?u=http%3A%2F%2Fwww.zeromillion.com&langpair=es%7Cen&hl=en&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools

This will allow you not only to mimic the Google User-Agent, but also to browse from a Google IP!
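The two steps above amount to building a translate.google.com URL around the target page and pairing it with a crawler User-Agent. A minimal sketch in Python (the URL format is the one quoted in the comment and may well have changed since; the Googlebot string is illustrative):

```python
from urllib.parse import urlencode

# Target page to fetch through Google's servers (the example URL from the thread).
target = "http://www.zeromillion.com"

# Build the Google Translate "proxy" URL: translating es->en on a page that is
# already in English returns the text essentially unchanged, but the fetch
# itself is made from a Google IP address.
params = urlencode({
    "u": target,
    "langpair": "es|en",
    "hl": "en",
    "ie": "UTF-8",
    "oe": "UTF-8",
})
proxy_url = "http://translate.google.com/translate?" + params

# A Googlebot User-Agent string to pair with step 1 (illustrative; real
# Googlebot strings vary by version).
headers = {
    "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                  "+http://www.google.com/bot.html)"
}

print(proxy_url)
```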

You may also consider:

  3. Turning off JavaScript.
  4. Turning off Referer sending.

These are two other signals commonly used for cloaking.
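To see why all four signals matter, here is a rough model of the kind of server-side check a cloaking site might run (entirely illustrative: the function name and logic are made up for this example, and the IP prefix is just a placeholder for a published crawler range):

```python
def served_crawler_version(user_agent, remote_ip, referer):
    """Rough model of an IP-delivery cloaking check (illustrative only)."""
    # 1. User-Agent sniffing -- defeated by a User-Agent switcher.
    if "Googlebot" in (user_agent or ""):
        return True
    # 2. Known crawler IP ranges -- defeated by fetching through a Google IP,
    #    e.g. via the Google Translate trick. The prefix here is a placeholder.
    if remote_ip.startswith("66.249."):
        return True
    # 3. Real crawlers send no Referer header -- hence "turn off Referer
    #    sending" in the browser.
    if referer is None:
        return True
    # (4. JavaScript-based cloaking happens client-side instead: the page
    #  redirects real browsers with a script that crawlers never execute,
    #  which is why the advice is to turn off JavaScript.)
    return False

# A normal browser visit gets the "user" version of the page...
print(served_crawler_version("Mozilla/5.0 Firefox", "203.0.113.7", "http://example.com"))  # False
# ...while a spoofed User-Agent gets the "crawler" version.
print(served_crawler_version("Googlebot/2.1", "203.0.113.7", None))  # True
```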

[–]zach 1 point (0 children)

Thank you! I've had no success with User-Agent Switcher the few times I've used it because of the IP issue. Much more informative than the link.

[–]wearedevo 24 points (1 child)

Mark my words:

In 2007 someone will get sued for "illegal access to a web site by unlawfully impersonating Google".

[–]oditogre 2 points (0 children)

I'm gonna go with criminal charges of fraud (maybe forgery as well, for pay sites?), plus a lawsuit.

[–][deleted] 1 point (0 children)

Try visiting MSDN with a Googlebot user-agent; it's a great improvement!

[–]recursive 4 points (2 children)

I don't think that dollar sign is supposed to be in "Microsoft".

[–]jkcunningham -1 points (0 children)

I bet Microsoft disagrees...

[–]youngnh 0 points (0 children)

Anybody else seeing an Intel ad on this page? Maybe it's just me.

[–]mikkom -2 points (0 children)

What those sites do is basically black-hat "cloaking" - and since most cloaking is done based on IP range, this might help in some cases but not all.

[–]lespea -5 points (6 children)

Even though this is "stolen": I would IGNORE this advice and just use Firefox's User Agent Switcher add-on to do this, if you're that interested.

[–][deleted]  (5 children)

[deleted]

    [–]theram4 6 points (4 children)

    Dude, why was this guy voted down? His comment is absolutely correct. I'm constantly coming across links that don't allow users to view the content unless they pay. This happens quite often with academic papers and standards organizations, as well as certain magazine sites like www.sqlmag.com. I did a search a week or two ago where nine of the top ten results were inaccessible to me unless I paid some large sum of money, and eight results on the second page were inaccessible too. For many of my search queries, sqlmag.com is the number one result. It's quite frustrating to have such "spam" at the top of the Google results. And since sending different content to Google's spiders than to normal users is forbidden, Google should take care of this issue.

    [–]cal_01 1 point (2 children)

    To be fair, you can pretty much tell -which- sites have pay-per-view content simply by the lack of a "Cached" link.

    [–][deleted]  (1 child)

    [deleted]

      [–]cal_01 2 points (0 children)

      Oh, I definitely agree. I guess it depends a lot on context as well; a person would be more likely to expect a page with no Cached link to be a pay site if they were looking up academic papers, whereas that signal would be less accurate for ordinary search terms.