email: ebobnar at gmail.com

Affordable SEO Consulting for the Results-Oriented Online Business.




April 24, 2005

Speculation on the Google 101k Filesize Limit

I recently stumbled upon a post over at SearchEngineWatch discussing cached files that occasionally show up in Google at sizes far exceeding its 101k file size limit. Doing a little searching, I found the following:

That's a reported file size of 513k. However, when I checked the actual cached file itself, it was still only 101k. So what's an earnest search geek to think? Most likely, Google's getting ready to index some larger files, and we're seeing the first inkling of that through some test cases. But it hasn't been rolled out quite yet.
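For anyone who wants to repeat that check, here's a minimal sketch of the comparison in Python. It assumes you've already saved the cached copy to disk or fetched it yourself, and that "101k" means 101 × 1024 bytes (my guess, not something Google documents here). The function names, the 1k tolerance, and the stand-in data are all hypothetical, just for illustration.

```python
# Assumption: "101k" is read as 101 * 1024 bytes. The figure comes from the post;
# the byte interpretation is a guess.
CACHE_LIMIT_KB = 101


def size_in_kb(content: bytes) -> float:
    """Return the size of the raw page bytes in kilobytes (1k = 1024 bytes)."""
    return len(content) / 1024


def compare_sizes(reported_kb: float, content: bytes, tolerance_kb: float = 1.0) -> dict:
    """Compare the size shown in the search listing with the measured size of the cached copy."""
    actual_kb = size_in_kb(content)
    return {
        "reported_kb": reported_kb,
        "actual_kb": round(actual_kb, 1),
        # True when the listing and the cached copy disagree by more than the tolerance
        "mismatch": abs(reported_kb - actual_kb) > tolerance_kb,
        # True when the cached copy sits right at the assumed limit
        "at_limit": abs(actual_kb - CACHE_LIMIT_KB) <= tolerance_kb,
    }


# Stand-in for a real cached page: 101k of filler bytes, not actual Google data.
cached_copy = b"x" * (101 * 1024)
result = compare_sizes(513, cached_copy)
print(result)
```

A listing that reports 513k while the cached copy measures 101k would come back flagged as both a mismatch and sitting at the limit, which is the pattern described above.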

Here's a list of the major search engines and the approximate maximum HTML file size each will index. Note that these numbers apply to HTML file sizes. Search engines have been known to index considerably larger files when it comes to PDF and DOC file types, among others.

I found a few Yahoo pages claiming cached pages as big as 725k, but, when measured, the actual cached pages topped out around 240k.

Incidentally, I noticed that even though SearchEngineWatch has taken trackbacks off their blog, they've left them up on their old posts. Seems a bit strange, since that would be the place a smart trackback spammer would be most likely to strike. Maybe they just haven't gotten around to it yet. Personally, I hope they bring them back. I think people are a bit too spam-paranoid these days, and it cuts into the community nature of the web.