ah, mephistophelis.: Hack A Search Result Back From That Stupid "About:Blank" Page Into The Cache Google Promised You

...because when it comes to webcache.googleusercontent.com, "about:blank" always lies.

Well, almost always.

This is not a post on Google's cache per se, just on that stupid, stupid about:blank page, which I have been tempted lo these many years to call a tool betraying the hand of the censor, but, seeing as I have not the proof that would be proof and (though we all suspect it) that's a heavy charge -- well, I make no such claim. Wouldn't dream of it.

Picture this: There you are, grooving along quietly to yourself, O Intrepid Yet Gentle Reader, happily enjoying an Internet still (mostly) as Free as Our Creator Intended, making use of your mind, exploiting your curiosity, following your particular whim, building proof of an hypothesis, indulging in a fetish, uncovering published secrets, or what-have-you. You expect a certain page to be 404 file-not-found, or you want the page before the latest update, or redaction, or backpedaling, or perhaps the information was only accidentally disclosed, and has since been hastily re-sequestered in the bowels of a secret directory of some shamelessly vast database.

But: Google, or specifically "webcache.googleusercontent.com," promises you a cache. You don't even have to deal with the disappointment that has become the Internet Archive's Wayback Machine ever since the new format (don't get me started about censorship I wish I could prove; I would rap all their noses with the sturdiest rolled-up eff.org RSS feed until they retreated, whining, to fix the lame fascism-serving bugs in the code).

Ahem. You know that one of the following is true about the page you want:

it was cached only recently
it was pulled oh so recently
hardly a soul accesses it
you know a search string uncommon enough that few have used it
you know of a mirror of the page likely to be crawled, and thus cached, less often

In any event, you bet the chances are good it will be intact and whole in Google's cache of the page. Eagerly you click on it when BAM!

about:blank.

I have never met an about-colon-blank outcome to a Google search result that was NOT a complete and ugly LIE.

The usual reason has to do, or appears to have to do (it could be magick for all I know) with the concept of HTTP. HTTP stands for Hypertext Transfer Protocol. It is the Protocol, or 'convention,' marking up or giving stage directions to the hyper, or 'meta,' text. Alphanumeric symbolic squiggles virtually represented on your computer screen in a nostalgic ritual nod to the actual ink-and-paper hard copy -- which hard copy more and more of us do not know how to produce and for which I hope the day does not arrive when we will all wish we had had the aptitude --

-- (takes breath) -- and HTTP is the Protocol, or 'convention,' by which hyper, or 'meta,' text, or words about words are ...transferred, or 'transmitted' between domains.

Here is my* theory behind why this works:

Webcache.googleusercontent.com is a sub-domain of googleusercontent.com, a domain that belongs to Google. This is from whence come the cached pages one sees as the result of a Google search. In retrieving one, Google need only transfer within, not across: no need for that HTTP.

In fact, when you REMOVE the "http://" from the address of an about:blank cache page, what do you usually get? BOOM. What you wanted.
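For those who would rather not retype addresses by hand, here is a minimal sketch of the trick in Python. It rests on assumptions of mine, not on anything Google documents: that the cache address looks like "...search?q=cache:" followed by an optional hash and a colon, then the address of the page, and that the "http://" to remove is the one belonging to the page, not the one in front of webcache.googleusercontent.com. The function name strip_inner_scheme and the example.com address are made up for illustration.

```python
import re

# Assumed shape of a cache address (not taken from any official documentation):
#   http://webcache.googleusercontent.com/search?q=cache:[HASH:]http://page...
# The HASH part is optional. Only the scheme right after "cache:" (or after
# "cache:HASH:") is removed; the leading scheme of the address is left alone.
_INNER_SCHEME = re.compile(r"(cache:(?:[^:/]+:)?)https?://", re.IGNORECASE)


def strip_inner_scheme(cache_url: str) -> str:
    """Return cache_url without the 'http://' or 'https://' of the cached page."""
    return _INNER_SCHEME.sub(r"\1", cache_url, count=1)


if __name__ == "__main__":
    # Hypothetical address, for illustration only.
    example = (
        "http://webcache.googleusercontent.com/search"
        "?q=cache:http://www.example.com/page.html"
    )
    print(strip_inner_scheme(example))
```

Paste the printed address into the address bar. An address that does not match the assumed shape comes back unchanged, so nothing is lost by trying.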

Here's a video tutorial.

Now I must add that I think this reason is simplistic, as adding the "http://" back does not always produce an about:blank -- certainly not when you have already gotten the page, or when you leave the hash in. But it's a good way to remember it.

*h/t to Corporanon

Be seeing you.
