The Invisible Web: Searching the Hidden Parts of the Web

David Mason (Victoria University of Wellington, New Zealand)

The Electronic Library

ISSN: 0264-0473

Article publication date: 1 February 2002

Citation

Mason, D. (2002), "The Invisible Web: Searching the Hidden Parts of the Web", The Electronic Library, Vol. 20 No. 1, pp. 58-62. https://doi.org/10.1108/el.2002.20.1.58.3

Publisher: Emerald Group Publishing Limited


Everyone has used a search engine, and most of us pride ourselves on our ability to find things on the Internet. This book introduces readers to the parts of the Internet that search engines cannot reach: the deep Internet. Many people are only vaguely aware that it exists. There is much that is valuable and easily found on the Internet, but there is also a large and growing body of high quality information that is hidden from even the best search engines. Various estimates put the volume of the hidden Internet at up to 2,000 times the size of the surface Internet. Much of the hidden information resides in dynamic databases, where one must go directly to the site and execute a query. The contents of these sites will never be indexed by any Web crawler because the pages are generated on the fly and then discarded.

This ASLIB Know How Guide explains what the invisible Internet is, why it poses a problem to Internet users and how to make use of it. The visible Web consists of those Web sites that have been identified and indexed by search engines. The invisible Web is everything else. Apart from dynamic databases, it includes sites where only the homepage can be indexed, sites that exceed the search engine’s maximum number of pages, and sites that no other page links to, which a crawler that only follows links will never find. Other problems include pages displaying real‐time or constantly updated information, such as news, that the Web crawlers cannot keep up with, and “gated” sites that require some form of user registration before the pages can be accessed. Despite this, the invisible Web is a valuable resource: over 95 per cent of information in the invisible Web is free of charge.
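The link-following limitation described above can be illustrated with a minimal sketch. It is not taken from the book: the site graph, the URLs and the function name `crawl` are all hypothetical, and a real crawler is far more elaborate. The sketch only shows that a crawler which discovers pages by following links never reaches an unlinked page or a page that exists only as the result of a query.

```python
from collections import deque

# Hypothetical toy site: each URL maps to the URLs it links to.
LINKS = {
    "example.org/": ["example.org/about", "example.org/search"],
    "example.org/about": ["example.org/"],
    "example.org/search": [],   # a query form; it links to no result pages
    "example.org/orphan": [],   # a static page that nothing links to
}

# Every page that exists, including one generated only when a query is run.
ALL_PAGES = set(LINKS) | {"example.org/search?q=deep+web"}


def crawl(seeds, links):
    """Return the set of URLs reachable by following links from the seeds."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


visible = crawl(["example.org/"], LINKS)
invisible = ALL_PAGES - visible
print(sorted(invisible))
```

In this toy example the orphan page and the query result page end up in `invisible`, which matches two of the categories the guide describes.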

This book is very practical. For example, it offers a comprehensive collection of specialist search tools to explore areas hidden within health, education, agriculture, government and many more fields. This is supported by an annotated list of selected hidden Web resources. It finishes with step‐by‐step examples of how the invisible Web can be mined to answer sample queries, either by using specialist search engines or by improving your searching on standard search engines, for example by finding pdf files on Google.
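As an illustration of the last point, one way to restrict a Google search to pdf files is the `filetype:` search operator. The sketch below is an assumption about how such a query might be built, not the book's own worked example; the function name `pdf_search_url` and the search terms are hypothetical.

```python
from urllib.parse import urlencode


def pdf_search_url(terms, base="https://www.google.com/search"):
    """Build a search URL that limits results to pdf files.

    Assumes the search engine honours the `filetype:` operator.
    """
    query = f"{terms} filetype:pdf"
    return f"{base}?{urlencode({'q': query})}"


url = pdf_search_url("invisible web")
print(url)
```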

This is an excellent book, well written and comprehensive in its coverage. With approximately two million new Web pages appearing every day, more than ever users need to be able to locate high quality sources easily and quickly. This book shows one exactly how and where to look.
