Australian Library and Information Association

The first ALIA Top End Symposium: Powering our Territory

Invisible web

Julie Adams

What is the invisible web?

  • Also known as the 'deep web' or 'hidden web'
  • Invisible to search engines
  • How search engines work
  • The World Wide Web (8 billion)
  • Size does matter
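
As a rough illustration of the 'How search engines work' point: a crawler typically discovers pages by following hyperlinks, so content that can only be reached by filling in a form stays out of its index. The sketch below is a minimal Python example, not any particular search engine's implementation. The sample page, its paths and the LinkCollector class are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: one ordinary link and one catalogue search form.
PAGE = """
<html><body>
<a href="/about.html">About</a>
<form action="/catalogue/search" method="post">
  <input type="text" name="title">
  <input type="submit" value="Search">
</form>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Records link targets (followable) and form actions (not followable)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            # A simple crawler queues this URL and fetches it later.
            self.links.append(attrs["href"])
        elif tag == "form" and "action" in attrs:
            # Results behind a form need typed input, which a crawler cannot supply.
            self.forms.append(attrs["action"])


def discover(html):
    """Return (links, forms) found in one page of HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links, collector.forms


links, forms = discover(PAGE)
print("Crawler will follow:", links)
print("Crawler cannot query:", forms)
```

Whatever sits behind the form in this example would be part of the invisible web unless another page links to it directly.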

Why is it invisible?

  • Web is huge
  • Constantly changing
  • Cost

Need alternative strategies to search engines

  • Think human
  • Use invisible web directories
  • Collect URLs
  • Use favourites
  • Go to the experts/source
  • Search for portals

Crawlers suck

Probability of a crawler locating a web page = 40 percent
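
To put the 40 percent figure in perspective, here is a small worked calculation in Python. It assumes, purely for illustration, that the 8 billion quoted earlier counts web pages and that the 40 percent probability applies evenly across all of them. Neither assumption is stated in the talk.

```python
# Figures quoted in the talk; how they are combined here is an assumption.
TOTAL_PAGES = 8_000_000_000   # assumed to be a count of web pages
FOUND_PERCENT = 40            # chance that a crawler locates a given page


def coverage(total_pages, found_percent):
    """Return (pages found, pages missed) using integer arithmetic."""
    found = total_pages * found_percent // 100
    return found, total_pages - found


found, missed = coverage(TOTAL_PAGES, FOUND_PERCENT)
print(f"found: {found:,}  missed: {missed:,}")
```

Under these assumptions a crawler would reach about 3.2 billion pages and miss about 4.8 billion.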

  • Non-HTML file formats
    Crawlers were originally designed for HTML text, not .pdf, .doc, .jpg, .mp3
  • Solution
    Search for specific formats, use format specific search engines
  • Dynamically generated pages
    Spider traps, storage intensive, crawlers can't type queries into forms
  • Solution
    Locate the source
  • Password protected sites
    Need passwords, crawlers can't type them in
  • Solution
    Locate the source, use libraries, register
  • Large websites
    Crawler does not go deep
  • Solution
    Use site specific search engines, search for 'database', hunt - not gather
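
Two of the solutions above, searching for specific formats and searching within a single site, can be sketched as query strings. This is a minimal Python example that assumes a search engine supporting the filetype: and site: operators (Google does, but support varies between engines). The build_query helper and the sample search terms are hypothetical.

```python
def build_query(terms, filetype=None, site=None):
    """Return a query string with optional format and site restrictions.

    Assumes the target search engine understands the filetype: and site:
    operators; check the engine's help pages before relying on them.
    """
    parts = [terms.strip()]
    if filetype:
        # Accept ".pdf" or "pdf" and emit "filetype:pdf".
        parts.append(f"filetype:{filetype.lstrip('.')}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


# Format-specific search: only PDF documents.
print(build_query("annual report", filetype=".pdf"))

# Hunting for a database on one site rather than gathering from the whole web.
print(build_query("library statistics database", site="alia.org.au"))
```

Adding the word 'database' to the terms follows the advice above. The aim is to find the source's own search page and then query it directly.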

http://www.alia.org.au/groups/topend/2003.symposium/invisible.web.html
© ALIA, 1 March 2010