Monday, September 03, 2007

web crawler question

can search engines find pages that are not linked from anywhere else?

if so, what sort of crawler functionality do they need to achieve this?

4 comments:

nice try said...

in this particular case, the page used to be linked from elsewhere, so presumably the search engine's database already knows that the url exists and now simply refreshes that page's contents

(it is still strange that the page didn't show up in the search results for a brief while when it wasn't linked, and then suddenly resurfaced even though nothing else changed)
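
a rough sketch of that guess, in python. this is not how any real search engine is known to work; the class name `UrlIndex`, its methods, and the `fetch` callback are all made up for illustration. the idea is just that discovery follows links, while refreshing walks the stored list of urls and never checks whether a link still exists.

```python
from typing import Callable, Iterable, List, Set


class UrlIndex:
    """Toy model of a search engine's URL database (hypothetical)."""

    def __init__(self, fetch: Callable[[str], Iterable[str]]) -> None:
        # fetch(url) returns the links found on that page. It is injected
        # so the sketch can run against a fake in-memory "web".
        self.fetch = fetch
        self.known: Set[str] = set()

    def crawl(self, seeds: Iterable[str]) -> None:
        """Discover pages by following links, remembering every URL seen."""
        frontier: List[str] = list(seeds)
        visited: Set[str] = set()
        while frontier:
            url = frontier.pop()
            if url in visited:
                continue
            visited.add(url)
            self.known.add(url)
            frontier.extend(self.fetch(url))

    def refresh(self) -> List[str]:
        """Re-fetch every stored URL, whether or not anything links to it."""
        refreshed = []
        for url in sorted(self.known):
            self.fetch(url)
            refreshed.append(url)
        return refreshed


# Fake web: the home page links to a second page at first.
web = {
    "http://a.example/": ["http://a.example/orphan.html"],
    "http://a.example/orphan.html": [],
}
index = UrlIndex(lambda url: web.get(url, []))
index.crawl(["http://a.example/"])

# The link is removed, but the URL is already stored in the index.
web["http://a.example/"] = []
print(index.refresh())
```

in this toy model the second page is still re-fetched after its only incoming link disappears, because `refresh` reads from the stored set rather than from the link graph.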

Sanketh said...

DNS records, if someone is willing to give them out all at once.

nice try said...

DNS makes sense for standalone websites.

the object in question is an HTML document in my public www directory on a www.blah.edu web server.

Arvind Narayanan said...

opera sends your browsing history to google, or at least it used to a few years ago.