The Invisible Web

Are the main search engines failing us?

Introduction

Content on the Internet has grown exponentially within the past few years. Goldmines of information exist on the Net, if only you could find them! The world's leading search engine, Google.com, currently provides links to over 1.6 billion web pages. Even this staggering amount of data is nothing compared to the amount of content that is not indexed by the leading search engines, often referred to as the 'Invisible Web'.

Much of this information is in the form of publicly available databases, such as phone directories, newspaper archives or medical dictionaries. Due to the specialist nature of some of these databases, the content contained within them tends to be exactly what you are looking for and can be of a very high standard.

Search Engine Blues

Much of the difficulty in accessing the Invisible Web is the fault of the major search engines. Most major search engines employ automated programs called 'Spider Bots' or 'Crawlers' that literally 'crawl' their way through the Internet from link to link, indexing web pages as they hit on them. Unfortunately, if no links are found to a particular web site, the site will fall through the cracks into the Invisible Web, beyond the reach of the general Net community, who miss out.
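The link-to-link crawling described above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine is implemented: the LINKS table is a made-up, in-memory stand-in for the web, and the page names and the crawl function are hypothetical. It shows why a page that nothing links to is never reached, even when that page itself links out to the rest of the web.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
# The page names are invented for illustration only.
LINKS = {
    "home": ["news", "about"],
    "news": ["archive"],
    "about": ["home"],
    "archive": [],
    "phone-directory": ["home"],  # links out, but nothing links to it
}


def crawl(seed, links):
    """Return the set of pages reachable from seed by following links."""
    indexed = set()
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if page in indexed:
            continue  # already indexed, do not visit twice
        indexed.add(page)
        # Queue every page this one links to.
        queue.extend(links.get(page, []))
    return indexed


indexed = crawl("home", LINKS)
invisible = set(LINKS) - indexed

print("Indexed:", sorted(indexed))
print("Invisible:", sorted(invisible))
```

Starting from "home", the crawler never discovers "phone-directory", because no indexed page points to it. In this toy model, that page is part of the Invisible Web.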

Furthermore, even if a web site administrator submits a site to a major search engine, there is no guarantee that it will be indexed. In fact, with many search engines it takes several attempts to register a site, and on average they take two to four weeks to process a submitted site. In the face of such problems, many sites go unchecked by the main search engines and fall through the cracks.

Search the Invisible Web

InvisibleWeb.com is a good place to start your search for more specialised information. It calls itself "the search engine of all search engines", and so long as you know roughly what you are looking for, it will provide you with a list of relevant sites to continue your search in more detail. For example, if you are looking for newspaper reports on famous incidents, you will only be provided with links to recognised newspaper archives, with none of the crazy unrelated links that normal search engines are so fond of throwing up at you.

The reality is that it is quite easy for anyone to set up a site and keep it quite covert in terms of the public's awareness of it. So long as the search engines cannot find it through links, and it is never submitted by anyone to the engines, it will effectively remain outside the public domain.

Given estimates that the Invisible Web is growing at a much faster rate than the rest of the Internet, it is possible to imagine a situation in the near future where the major search engines will simply not be able to cope with the pace of Internet expansion. Let's hope that they never become complacent about their technology. I, for one, will stick with Google as my default home page.

John Collins

I have been writing about web technology and software development since 2001. I am the developer of the Alpha Framework for PHP, and the five.today personal productivity app. I love open source, technology, and economics.