Australian Library and Information Association

The Australian Library Journal

Still lost in cyberspace? Preservation challenges of Australian internet resources

Wendy Smith

Manuscript received April 2005

This is a refereed article

Introduction

In its relatively short life, the web has revolutionised the way information is produced, disseminated and used. Librarians and others charged with the preservation of information are, for the most part, moving only slowly to capture, and provide long-term access to, materials published online.

The effect of the web on late twentieth and twenty-first century life has been compared with that of the invention of printing with movable type in fifteenth-century Europe. A major difference, however, is that the survival of material on the web depends on early discovery, intervention and capture into a dedicated digital archiving system. In Australia, the PANDORA Archive provides a mechanism for selective preservation of web-based publications. The Internet Archive takes a broader, global approach to archiving the web. Both archives commenced their operations in 1996, adopting quite different selection and capture strategies.

Only a few longitudinal studies of the survival of online information have been undertaken worldwide. Many of these have concentrated on the stability of the Uniform Resource Locators (URLs) of web pages and what this means for the ongoing validity of the citation of web materials in scholarly publication. Others have concentrated on the stability of web content and its impact on scholarship. This paper describes an approach to determining what web-based information is surviving in a wider publishing context. It is based on a survey of a small database of Australian publications that were available on the web in late 1995. These publications were listed by the National Library of Australia (NLA) as part of its Project to Improve End-User Access to Internet Resources and their persistence was monitored from 1996 to 1998 (Smith 1998a). The results of that monitoring have now been updated to the end of 2004 and are reported in this paper.

The web

Since it became publicly available in 1993, the web has become the resource of first choice for many information seekers - from academic research to general day-to-day enquiries about goods and services. Its growth has been very rapid in terms of the amount of material published on it, its geographic reach, and the number of users. The social, political and economic consequences are well documented, as is its essentially ephemeral nature.

Unlike the publication and distribution of print-based media, there are very few barriers to creating and publishing materials on the web. The ease with which anyone can publish on the web has resulted not only in the steady increase of materials similar to traditional published materials - serials, newspapers, books, diaries - but also in the emergence of a range of materials which have no print equivalents.

Early users of the internet were mostly based in research or academic establishments. The web opened up this new technology for the communication and dissemination of information to the whole world. Australia was an early adopter of the technology; the Australian Vice-Chancellors' Committee (AVCC) hosted an academic network (AARNet) from the late 1980s. The NLA was a member of this network, establishing a web server in 1995 which hosted a number of Australian Commonwealth Government department websites as well as its own web presence. Notable sites such as Coombsweb at the Australian National University pioneered publishing on the web. When it was launched in 1994, Coombsweb was the 880th web server in the world (Ciolek 1996). Through the early and mid-1990s, Australia's uptake of web technology was rapid. Australia was second behind the United States in terms of absolute numbers of host sites up to 1993. Australia continues to have one of the world's highest rates of internet penetration; Internet World Stats (2004) ranks it sixth at 65.9 per cent of the population using the internet, behind Sweden, first at 74.6 per cent, Hong Kong, the United States, Iceland, and the Netherlands.

Initially the web was relatively easy to navigate. The contents of each page could be scrolled through to read or link to the full contents of a website. Developments in technology and design have enabled the generation of web pages from databases on demand, in response to an individual's or a programmed query, thus creating a new layer, the deep web. Most of the deep web is not reached by the common search engines or by any of the harvesting software currently in use. Estimates of the size of the deep web indicate that it is very much larger than the surface web.

Maintaining long-term access to the web

The size and growth-rate of the web are but two factors in providing long-term access to its contents. Other well-documented factors that act against the survival of online information include: instability of web hosts and internet service providers (ISPs); instability of URLs; changeability of information; lack of standardisation in multimedia and web presentation software; and the realities of the deep web. Some of these are not new and were canvassed in an earlier paper (Smith 1998a); others have emerged in the intervening six years.

Doubts about the likelihood of retaining access to web content for any length of time have been expressed since the web began. These have been based on the volatility of web hosts, and on the stability of the organisations or individuals mounting and maintaining information on the web. Lesk suggested that the average life of a URL was forty-four days, and that 'the web is perhaps more like a newsstand than a library' (Lesk 1996). Since that 1996 estimate, many other estimates of the life-span of web information have been made. The Internet Archive FAQs suggest that 'the average life of a web page is only 77 days' (Internet Archive 2000?b). Whether these estimates are correct is hard to determine. What is true is that material published on the web is far less stable in content and location than the collections in library systems that predate the web. Web information is continually lost, mislaid or altered. For most web publishers, currency of information is the most important consideration. Information may be overwritten, added to, or removed altogether. Although some publishers keep 'archived' information, it is not always retrievable on their public websites - an example is publishers of online newspapers.

This has serious consequences for scholarship. Unless preserved and authentic web pages are available for future use, the value of research and scholarly publishing based on web resources is compromised. Scholarship relies on the accurate citation of references and the ability to verify the information content represented by those citations. Web stability is essential for preserving resources intact and with the degree of integrity required for scholarly use, both in terms of the essentials for locating the information source (the URL) and of the persistence and immutability of the content of the websites. Several case studies examine what is being lost and describe strategies to make web citations more durable (eg Casserley and Byrd 2003, Germaine 2000, Rumsey 2002, Nelson and Allen 2002, Taylor and Hudson 2000, Arms 1999, Koehler 2002 and 2004). However, these studies have used quite different methods and have examined very different sets and numbers of resources over varying periods of time, which makes overall direct comparison of results difficult. Researchers have attempted to identify the most stable domains, but again this varies with the type of information being monitored. In two of the longest-lasting studies on web persistence, Koehler (2002, 2004) examined a number of sites from 1996 to 2004, using a measurement of 'half-life' (the time that it takes for 50 per cent of sample URLs to disappear or change) as an indicator of stability. Most studies listed in Koehler's papers give half-lives of around 1.5 to 4.5 years. Unlike radioactive decay, however, the loss of URLs does not appear to follow a regular exponential pattern.

Although there are some notable exceptions, those individuals and organisations charged with preserving information for the long term have not fully grasped the realities of capturing and providing long-term access to web-based information. Efforts to capture and archive the web are still not fully developed, even though more than ten years have passed since its first acknowledged impact on information production and dissemination. The Internet Archive is currently the most comprehensive archive, yet it only reaches the surface, not the deep web. Individual national efforts, generally administered by national libraries, are either highly selective (such as the NLA's PANDORA Archive), or, at best, snapshots in time. Many countries, such as the United States of America and the United Kingdom, are still developing policies and procedures for dealing with this issue. The Library of Congress, through its Minerva project, has captured some themed collections including the US Elections of 2000 and 2002 and September 11, 2001. It has recently announced, through the National Digital Information Infrastructure and Preservation Program (NDIIPP), eight projects within a 'national digital preservation infrastructure' since 'Millions of digital materials, such as websites mounted in the early days of the Internet, are already lost - either completely or in their original versions' (NDIIPP 2004).

Australian website survival: 1995-2004

In late 1995, as part of the Project to Improve End-User Access to Internet Resources, librarians at the NLA identified fifty web publications that they considered significant and worthy of long-term preservation and stored information about them in a 'mini-database'. Each site was listed, its content was described briefly, and subject headings were assigned to it. A list of all records in the mini-database is given in Appendix 1; two examples of individual records are given in Appendix 2. As the Project's website explained:

During the latter stages of the Project, the Group was asked to devise a mini-database of fifty (50) descriptive records for resources on the internet. It was considered that a wider rather than narrower Australian subject [emphasis] would be preferable for this purpose ... The test site which has been created is by no means ideal - by virtue of the time and resources available to us (Ryan 2003).

In the event, fifty-three URLs were stored, because three sites were located in three places on the web. All URLs were correct and accessible as at 3 January 1996. The fifty sites were monitored over a two-year period 1996/97 (Smith 1998a). These sites were again examined in January and November 2004, nearly nine years after their first selection. For those sites that could not be located at the original URL, the steps listed below were taken to find them.

  • The original listing from 1996/97 was checked for typographic or other errors.
  • The original URL was truncated to locate the site by navigating from higher levels of the site or through site maps and site search options.
  • Google was used, with the country restriction of '.au' where appropriate, and the first 20 or so hits were examined.
  • The search process was repeated after a few days if the site was not found during the first search attempt.
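
The URL truncation step can be illustrated with a short sketch. This is an illustration only: the paper does not describe any software used for this step, the function name truncated_candidates is hypothetical, and the example URL is entry 10/50 from Appendix 1. The sketch simply lists the progressively shorter URLs that a searcher would try, from the parent directory up to the site root.

```python
from urllib.parse import urlsplit, urlunsplit


def truncated_candidates(url):
    """Return progressively shorter versions of url, ending at the site root.

    Hypothetical helper: it only generates candidate URLs and does not
    fetch them or check whether they still exist.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Drop one trailing path segment at a time; each result is a directory URL.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for candidate in truncated_candidates("http://www.nla.gov.au/misc/cutler/cutlercp.html"):
    print(candidate)
```

Each candidate would then be visited in turn, and the site map or site search at that level used to look for the relocated publication.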

The results of the 1996 and 2004 investigations are shown in Figure 1 below.

A 'findable' URL is one that can be found somewhere online and leads to content that is still representative of the information (or a cumulation and/or update) originally described in January 1996. It may be either the original URL or a URL that has been found by the strategy described above.

Figure 1: Survival rates for websites in the NLA mini-database 1996-2004

Only 22 per cent of sites are still at their original URL. The half-life for original URLs is just under four years. A relatively high percentage of sites (64 per cent) can still be accessed 'live' on the web, nine years after they were first identified and described, although the current versions are not necessarily the same as the original ones, either in content or in style. By extrapolation, the half-life of findable sites is around fourteen years. A small percentage still have the original 'look and feel' and content, particularly some of the government documents on the NLA site, and on law-related sites (those that are unaffected by updates to legislation). Some sites are obviously different, for example, the three major newspapers' sites, whose style, content and method of access have changed considerably.
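
The extrapolated half-life can be reproduced with a simple calculation. The sketch below assumes a constant rate of loss (exponential decay) and an elapsed time of nine years; both are assumptions made here for illustration, since the paper does not state how its figures were derived and notes elsewhere that web half-life does not appear to follow a regular exponential pattern. The function name half_life is hypothetical.

```python
import math


def half_life(fraction_remaining, years_elapsed):
    """Half-life implied by a single observation, assuming exponential decay."""
    if not 0 < fraction_remaining < 1:
        raise ValueError("fraction_remaining must be strictly between 0 and 1")
    return years_elapsed * math.log(2) / math.log(1 / fraction_remaining)


# 64 per cent of sites were still findable after about nine years.
findable = half_life(0.64, 9)
# 22 per cent of sites were still at their original URL after about nine years.
original = half_life(0.22, 9)

print(round(findable, 1))  # 14.0
print(round(original, 1))  # 4.1
```

Under these assumptions the findable sites give a half-life of about fourteen years, which matches the extrapolated figure. The same single-point calculation gives about 4.1 years for original URLs, slightly more than the 'just under four years' reported, which suggests that figure was read from the observed survival curve rather than extrapolated.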

Are these typical websites? At the time they were identified, in the very early days of the web, they probably were. They included academic, government and a few private sites. Given that until the mid-1990s, provision of internet services in Australia was almost exclusively the province of AARNet, this distribution is not surprising. Non-government and independent research sites were often hosted at an academic IP address, because there were very few commercial ISPs in 1995.

Top-level domains were determined for the fifty-three URLs on the original list and for those sites located in 2004 and are shown in the table below. The percentage of sites hosted by government agencies remains relatively constant, although the percentage hosted by the NLA has dropped, as government sites that were hosted by the NLA for government agencies in 1995 are now hosted on their own servers. The percentages of academic sites and commercial sites have reversed (representing a return to a 'core business' approach on the part of Australian universities and a reduction in 'experimental' sites). The nine academic sites that have persisted, however, have been very stable, having the same hosts in 2004 as they did in 1996.

Top-level domain                   1996        2004
Government                         20 (38%)    13 (36%)
  National Library of Australia     9 (17%)     5 (14%)
  Government (excluding NLA)       11 (21%)     8 (22%)
Academic                           18 (34%)     9 (25%)
Commercial                         12 (23%)    12 (33%)
Organisation                        2 (4%)      2 (6%)
Unknown                             1 (2%)      0
Gone                                0          17
Total                              53          53

Figure 2: Changes in top-level domain of the fifty-three web locations
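
The percentages in Figure 2 can be checked with a few lines of arithmetic. The sketch below uses the top-level counts from the table; the variable and function names are hypothetical. It shows that the 1996 percentages are taken over all fifty-three locations, whereas the 2004 percentages are consistent with a base of the thirty-six locations that could still be found, that is, excluding the seventeen that had gone.

```python
# Top-level counts taken from Figure 2.
counts_1996 = {"Government": 20, "Academic": 18, "Commercial": 12,
               "Organisation": 2, "Unknown": 1}
counts_2004 = {"Government": 13, "Academic": 9, "Commercial": 12,
               "Organisation": 2}
gone_2004 = 17


def percentages(counts):
    """Share of each domain as a whole-number percentage of the listed sites."""
    total = sum(counts.values())
    return {domain: round(100 * n / total) for domain, n in counts.items()}


shares_1996 = percentages(counts_1996)  # base of 53 locations
shares_2004 = percentages(counts_2004)  # base of 36 surviving locations

print(shares_1996)
print(shares_2004)
```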

The distribution of domains in the mini-database is very different from the overall composition of the web. Lawrence and Giles (1999) estimated that 83 per cent of servers contain commercial content; the other 17 per cent is distributed, with science and education around 6 per cent, health, personal, society and community content around 2 per cent each, and government, pornography and religion around 1 per cent. The top-level domain distribution in the mini-database is similar to that of the PANDORA Archive, which at November 2004 comprised government (41.7 per cent), academic (12.5 per cent), commercial (33.8 per cent), and organisations (12.1 per cent).

Where are the mini-database websites preserved?

The NLA's PANDORA project was among the earliest of the world's web archiving projects. It commenced as a proof-of-concept archive in 1996, based on the intellectual framework of Library and Archives Canada's selective web archiving project. Behind PANDORA was the assumption that comprehensive archiving could become unwieldy and unmanageable, and that it was better to put resources into selection and authentication rather than into an all-of-the-web grab-bag. The capture of material for PANDORA continues to be quality controlled to ensure that it is archived correctly and with the permission of publishers. PANDORA captures Australian publications, attempting 'to create the online electronic equivalent of a legal deposit library for Australian publications' (Smith 1998b: 63-75). Most of Australia's state libraries and several special libraries are now in partnership with the NLA in building the PANDORA Archive. They select for archiving according to their own selection guidelines, which are more or less based on those of the NLA. Although from time to time expansion of PANDORA's scope has been suggested (for example, Gatenby 2002), this has not eventuated. PANDORA's narrow collection development policy remained virtually unchanged from 1996 until mid-2003 when a realignment was announced (National Library of Australia 2003) which, in effect, restricted its collecting still further. PANDORA has not been responsive to changes in web publishing styles. It collects 'print-like' publications, although on the whole it excludes material that has print equivalents; it specifically excludes more innovative forms of publishing such as blogs. By the end of 2004 just over 7000 titles had been archived altogether.

Although PANDORA would seem to be the logical place to locate significant Australian digital publications, the narrow focus of its selection policy excludes many general materials. The effectiveness of PANDORA in preserving a subject-specific set of websites relating to the Australian wine industry is noted elsewhere (Smith, W 2004). Of a sample of more than fifty Australian wine sites, none were preserved in PANDORA, but all had some presence in the Internet Archive. Of a total of over 1200 Australian winery websites, only three are included in PANDORA.

In revisiting the websites listed in the 1995 mini-database, this paper examines the fate of a set of sites which are not subject-specific, but which were, nevertheless, identified as being significant Australian sites. The mini-database predates the PANDORA project, although the selection of its sites foreshadows the general spirit of the PANDORA selection guidelines. It is almost a proto-PANDORA, although refinements to selection criteria for PANDORA excluded sites with print equivalents, index sites and some of the other sites in the database. Of the fifty-three web locations, only five (10 per cent) have been selected for preservation in PANDORA, while seventeen are no longer available on the web. On the other hand, 82 per cent have a presence in the Internet Archive, which was just being established when the mini-database websites were first examined in November 1996. Access to three of these sites is blocked. Six of the mini-database sites disappeared from the web before the Internet Archive commenced its collecting. One of these is archived in PANDORA. Figure 3 shows the five selected by PANDORA and their current status.

No.    Title                            Selected by                        Status          PANDORA date range   Internet Archive       IA date range
18/50  Ausflag                          NLA                                Continuing      1997+                Yes                    1996+
20/50  Australian Republican Movement   NLA                                Continuing      1998+                Yes                    1996+
34/50  National Australia Day Council   NLA                                Continuing      2000+                Yes                    1998+
46/50  OzLit                            State Library of Victoria          Not maintained  1999-2001(?)         Yes                    1997-2001
47/50  The Australian Observer          State Library of New South Wales   Gone            1995-1996 (SLNSW)    Yes; last issue only   1996

Figure 3: Mini-database websites selected for PANDORA

Of the five sites in PANDORA, three are continuing and are still being updated. OzLit has ceased to be maintained. OzLit and the three continuing sites are held in the Internet Archive, which has copies that predate the PANDORA copies by one or two years. The Australian Observer is an anomaly. The NLA selected it for inclusion in PANDORA but, before it could be captured, it ceased to exist, although its final issue remained online for some time. This final issue is the only one in the Internet Archive. The PANDORA Archive has a full set selected and archived by the State Library of New South Wales some time after it became a partner in PANDORA.

Of the fifty titles, it is possible to identify a further group of perhaps fifteen to twenty as candidates for inclusion in PANDORA. Arguments can be made for and against the inclusion of almost any of the sites. Deciding the selection criteria in the early days of PANDORA was a committee process, which usually required a majority decision. This led eventually to the formulation of a set of rules. However, in the decade since those rules were developed the nature of the web has evolved and the patterns of its use have changed, signalling a need to extend the selection criteria. For instance, will future generations of researchers consider themselves well-served when they are presented with a monochrome, static microfilm of The Australian newspaper from the late twentieth century rather than being able to access a full-colour, interactive and searchable copy preserved in some digital archive?

The limited selection of these mini-database websites made by PANDORA is, in part, a result of the decision to commit resources to high quality control standards in the archiving process to ensure that each archived site was as true a replica of the site at the point of archiving as possible. In the early days of the web, when the amount of material being published on it was quite small, this decision still meant a reasonable coverage of the web by the archive; with the exponential increase in web publishing, a much smaller proportion of the web is now being captured.

The Internet Archive takes a much more comprehensive approach to archiving websites than PANDORA. Established by Brewster Kahle in 1996, it has been compiled from regular crawls of the web since then. Pages that are password protected and on secure servers are not captured, nor are those pages whose owners initiate their exclusion. Kahle acknowledged early that the Internet Archive 'will never be comprehensive, because the crawler software cannot gain access to many of the hundreds of thousands of sites... Still, the archive gives a feel of what the web looks like during a given period of time even though it does not constitute a full record' (Kahle 1997: 82). The purpose of the Internet Archive is to offer 'permanent access for researchers, historians, and scholars to historical collections that exist in digital format' (Internet Archive 2000?a). One of Kahle's reasons for starting the Internet Archive was that 'by 1996 there was enough material on the internet to show that this thing was the cornerstone for how people are going to be publishing. It is the people's library' (Rein 2004). Its long-term future remains an issue, given its private and relatively non-aligned status. However, in the past few years, the Internet Archive has gained increasing recognition as a major source of historic web archiving and is increasingly taking part in collaborative international projects. The Internet Archive is the only truly global web archive in existence, although it does not have all the attributes of a trusted repository.

In the Internet Archive no quality control is carried out, which can result in the absence of some of the original detail of the preserved sites. This is particularly the case for sites in non-standard mark-up languages. Internet Archive pages are captured 'as is' and remain unchanged. The rationale behind the Internet Archive is that it is better to attempt something with existing resources rather than to delay until perfection can be achieved. Vindication of that approach is the existence of a full record of web publishing from 1996, with enough examples of information content and 'look and feel' to analyse changes in publishing style since 1996. For many websites, the Internet Archive copy is the only record of a previous edition or the only complete run of a serial.

On the limited evidence of the analysis of two groups of Australian websites it would appear that few sites cannot be found in the Internet Archive. The inclusion of forty-one of the mini-database websites in the Internet Archive shows that, at present, it provides a much broader coverage of Australian web publication than PANDORA provides.

Conclusion

After nearly ten years, a large proportion of the mini-database of titles selected as having some significance to Australia can still be found online, even though they are not at the original URL. However, accessibility continues to drop, and, in another five years or so, only 50 per cent of the titles are likely to be found online. The Internet Archive provides the only mechanism for ongoing access to the content of the majority of these titles.

PANDORA continues to emphasise 'print-like' objects while excluding objects already available in print. It thereby risks failing to capture the full range of uses that the web now has in the communication and dissemination of information of all sorts, and failing to accommodate the evolutionary, increasingly database-driven steps of the web identified by Bruns (2004).

The new genres that emerge from the web's evolution 'demand some form of stewardship' (Smith, A 2004: 4). PANDORA's decision to exclude blogs is an indication of the difficulty of developing new models of stewardship in a profession as deeply conservative and as driven by profound idealism as librarianship. It reinforces concerns expressed in Seamus Ross's contention, much earlier in the web's evolution, that:

A certain narrow-mindedness has pervaded studies of electronic information, as the focus has been predominantly by national archives on the preservation of records about national governments themselves. More attention needs to be focussed on other records or information resources that document our culture and on a range of other institutions that produce them. Especially important are records that will allow us to give life to the many stories history can tell (Ross 1998: 23).

The audience for any digital preservation project is not easy to determine. PANDORA has taken a conservative approach and the material saved using this approach is 'quality'. Current trends in historical research and publishing indicate that this is not necessarily what future generations will want. Such informal documents as diaries of 'the common man', printed ephemera, family photograph collections, and those ephemeral publications - newspapers - are increasingly used as sources for historical scholarship, rather than biographies of important persons and the formal records of important events. Although the survival rate of Australian web publishing may improve in the future, for the present and the recent past the most comprehensive source of what remains of our online heritage is the Internet Archive. Until its future is guaranteed, and while our national archive continues to be constrained by its policies and resources, Australia will continue to lose the web component of its documentary heritage, ten years of which is already effectively lost.

References

Arms, W Y (1999) 'Preservation of scientific serials: three current examples' Journal of Electronic Publishing 5(2), http://www.press.umich.edu/jep/05-02/arms.html (viewed 14 February 2005).

Bruns, A (2004) 'Contemporary culture and the web' paper presented at Archiving Web Resources: Issues for cultural heritage institutions, 9-11 November, http://www.nla.gov.au/webarchiving/BrunsAxel.ppt (viewed 11 February 2005).

Casserley, M and Byrd, J (2003) 'Web citation availability: analysis and implications for scholarship' College and Research Libraries 64(4): 300-317.

Ciolek, M T (ed.) (1996) Coombsweb: ANU Social Sciences Server, Australian National University, Coombs Computing Unit, http://web.archive.org/web/19961221072448/coombs.anu.edu.au/ (viewed 14 February 2005).

Gatenby, P (2002) 'Digital continuity: the role of the National Library of Australia' Australian Library Journal 51(1): 21-30, http://www.alia.org.au/publishing/alj/51.1/full.text/digital.continuity.html (viewed 10 February 2005).

Germaine, C A (2000) 'URLs - uniform resource locators or unreliable resource locators' College and Research Libraries 61(4): 359-365.

Internet Archive (2000?a) About the Internet Archive, http://www.archive.org/about/about.php (viewed 11 February 2005).

Internet Archive (2000?b) Frequently Asked Questions, http://www.archive.org/about/faqs.php (viewed 14 February 2005).

Internet World Stats (2004) Top Ten Countries in Internet with the Highest Penetration Rate (as at 16 December), http://www.internetworldstats.com/top10.htm#pop (viewed 10 February 2005).

Kahle, B (1997) 'Preserving the internet' Scientific American 276(3): 82.

Koehler, W (2002) 'Web page change and persistence - a four-year longitudinal study' Journal of the American Society for Information Science and Technology 53(2): 162-171.

Koehler, W (2004) 'A longitudinal study of web pages continued: a consideration of document persistence' Information Research 9(2), http://informationr.net/ir/9-2/paper174.html (viewed 19 March 2004).

Lawrence, S and Giles, C L (1999) 'Accessibility of information on the web' Nature 400: 107-109.

Lesk, M (1996) Mad Library Disease: Holes in the Stacks. Lazerow Lecture, given at University of California Los Angeles, 18 April, http://lesk.com/mlesk/ucla/ucla.html (viewed 14 February 2005).

NDIIPP (2004) Library of Congress announces awards of $15 million to begin building a network of partners for digital preservation. Press release, 30 September, http://www.digitalpreservation.gov/about/pr_093004.html (viewed 10 February 2005).

Nelson, M L and Allen, B D (2002) 'Object persistence and availability in digital libraries' D-Lib Magazine 8(1), http://www.dlib.org/dlib/january02/nelson/01nelson.html (viewed 24 February 2005).

National Library of Australia (2003) Collecting Australian Online Publications, http://pandora.nla.gov.au/bsc49.doc (viewed 10 February 2005).

Rein, L (2004) 'Brewster Kahle on the Internet Archive and people's library' Open p2p.com, http://www.openp2p.com/pub/a/p2p/2004/01/22/kahle.html (viewed 11 February 2005).

Ross, S (1998) 'The expanding world of electronic information and the past's future' in E Higgs (ed.) Historians and Electronic Artefacts Oxford University Press, Oxford.

Rumsey, M (2002) 'Runaway train: problem of permanence, accessibility and stability in the use of web sources in law review citations' Law Library Journal 94(1): 27-39.

Ryan, E (2003) E-mail and attachment (mini.html) to the author, 24 June.

Smith, A (2004) 'The future of web resources: who decides what gets saved and how do they do it?' Paper presented at Archiving Web Resources: Issues for Cultural Heritage Institutions, 9-11 November, http://www.nla.gov.au/webarchiving/SmithAbby.rtf (viewed 11 February 2005).

Smith, W (1998a) 'Lost in cyberspace: preservation challenges of Australian Internet resources' LASIE 29(2): 6-25, http://www.sl.nsw.gov.au/lasie/jun98/jun1998.pdf (viewed 23 February 2005).

Smith, W (1998b) 'PANDORA: Providing long-term access to Australia's online electronic publications' Alexandria 10(1): 63-75.

Smith, W (2004) 'Wine on the web: Australian wine information on the web and its prospects for long-term preservation and access' AARL 35(2): 111-128.

Taylor, M K and Hudson, D (2000) '"Linkrot" and the usefulness of website bibliographies' Reference and User Services Quarterly 39(3): 273-276.


Biographical information

Wendy Smith was inaugural director of the National Library of Australia's PANDORA project. She has worked in materials conservation and information management in the academic and public sectors in Australia and overseas. She is completing a PhD on digital preservation at Charles Sturt University.

Appendix 1

Mini-database listing

No. Title URL at 1/1996
1/50 Australian Government Entry Point (Canberra, Australia) http://www.nla.gov.au/oz/gov/
2/50 Centre for Australian Studies in Wales (Lampeter, Cymru) http://www.lamp.ac.uk/oz/
3/50 Republican Debate (Melbourne, Australia) http://www.monash.edu.au/ncas/onlinecontents.htm
4/50 How to be Australian (Melbourne, Australia) NB actual article listed had a different name - Remove the Queen... By Sir Harry Gibb http://www.monash.edu.au/ncas/howtobeoz.htm
5/50 Threatened Fauna in Australia: A Select Bibliography (South Melbourne, Australia) http://mac-ra26.sci.deakin.edu.au/fauna.html
6/50 House of Representatives Internet Law Library Australia (Washington, USA) http://www.pls.com:8001/his/53.htm
7/50 House of Representatives (Canberra, Australia) http://library.aph.gov.au/house/
8/50 Commonwealth Budget Papers (Canberra, Australia) http://gov.info.au/budget95/budget95.html and
http://www.nla.gov.au/finance/budget95/budget95.html (file:///C:/fin...)
9/50 National Conservation and Preservation Policy for Movable Cultural Heritage (Canberra, Australia) http://www.nla.gov.au/3/npo/natco/natpol.html (file:///C:/3/npo.....)
10/50 Commerce In Content: Building Australia's International Future In Interactive Multimedia Markets (Canberra, Australia) Culter (sic) Report http://www.nla.gov.au/misc/cutler/cutlercp.html (file:///C:/misc.....)
11/50 Grandkids Australia (Brisbane, Australia) http://www.sofcom.com.au/Grandkids/
12/50 Statement of Principles for Media Reporting on Aboriginal and Torres Strait Islander Issues 1994 (Canberra, Australia) http://www.dca.gov.au/atsi.htm
13/50 Northern Territory Government (Darwin, Australia) http://www.nt.gov.au/
14/50 City of Bits: Space, Place, and the Infobahn (Cambridge, USA) http://www-mitpress.mit.edu/City_of_Bits/welcome.html
15/50 Australia Online (New York, USA) http://www.australia-online.com/
16/50 Documents on the Australian Republic (Canberra, Australia) http://www.nla.gov.au/pmc/republic.html (file:///C:/pmc.....)
17/50 LC MARVEL (Washington, USA) Library of Congress (LC) Machine-Assisted Realization of the Virtual Electronic Library (MARVEL) gopher://marvel.loc.gov/
18/50 Ausflag (Sydney, Australia) http://www.ausflag.com.au/
19/50 Commonwealth Consolidated Acts (Sydney, Australia) http://austlii.law.uts.edu.au/au/legis/cth/consol/act/ (probably a typo)
20/50 Australian Republican Movement (Sydney, Australia) http://www.mpce.mq.edu.au/~brendan/arm.html
21/50 Australian Media Facilities Directory (Sydney, Australia) http://www.amfd.com.au/
22/50 Australian Libraries (Canberra, Australia) http://info.anu.edu.au/ozlib/ozlib.html
23/50 aussie.index (Sydney, Australia) http://www.aussie.com.au/
24/50 National Library of Australia (Canberra, Australia) http://www.nla.gov.au/ (file:///C:/)
25/50 Political Science in Australia (Canberra, Australia) http://online.anu.edu.au/polsci/austpol/
26/50 Genealogy in Australia (Canberra, Australia) http://www.pcug.org.au/~mpahlow/welcome.html
27/50 Commonwealth of Australia Constitution (Canberra, Australia) http://ccadfa.cc.adfa.oz.au/~adm/constitution/con.html
http://www.nla.gov.au/pmc/constitu.html (file:///C:/pmc....)
28/50 Australian New Zealand Studies Centre (Pennsylvania, USA) http://www.psu.edu/research/anzsc/
http://www.psu.edu/research/anzsc/alt.html
29/50 Spectrum Management Agency (Canberra, Australia) http://www.sma.gov.au/
30/50 The Age: an Information Service of The Age Newspaper (Melbourne, Australia) http://www.theage.com.au/
31/50 The Sydney Morning Herald (Sydney, Australia) http://www.smh.com.au/
32/50 The Australian Financial Review (Sydney, Australia) http://www.afr.com.au/
33/50 Prime Minister, Paul Keating's Home Page (Canberra, Australia) http://www.nla.gov.au/pmc/pjkhome.html (file:///C:/pmc.....)
34/50 National Australia Day Council (Sydney, Australia) http://www.telstra.com.au/nadc/
35/50 Innovate Australia (Canberra, Australia) Innovation Statement http://www.dist.gov.au/events/innovate/index.html
36/50 OzKidz Literature (Ipswich, Australia) http://gil.ipswichcity.qld.gov.au/ozkidz/ozlit.html
37/50 Australian Information on the Internet (Ipswich, Australia) http://gil.ipswichcity.qld.gov.au/elib/ausint.html
38/50 Australian Gopher Servers (Canberra, Australia) gopher://info.anu.edu.au/11/OtherSites/au
39/50 OZLISTS: List of Australian Electronic Mailing Lists (Qld, Australia) http://www.gu.edu.au/gint/ozlists/ozlists_home.html
40/50 Australian Embassy Washington DC WWW (Washington, USA) http://www.aust.emb.nw.dc.us/
41/50 Creative Nation (Canberra, Australia) http://www.dca.gov.au/creative_nation/contents.htm
http://www.nla.gov.au/creative.nation/contents.html
42/50 Australian Department of Foreign Affairs and Trade Media Releases (Canberra, Australia) http://www.dpie.gov.au/dfat/pmb/releases/department/deptmnu.html
43/50 Australian Postcodes (Canberra, Australia) gopher://ccadfa.cc.adfa.oz.au/11/Gateway%20Services/Australian%20Postcodes
44/50 FactSheet Five Electric (San Francisco, USA) http://kzsu.stanford.edu/uwi/f5e/f5e.html
45/50 Australian A-Z Animal Archive (?, Australia) http://www.com.au/aaa/A_Z/Home.html
46/50 OzLit (East Oakleigh, Australia) http://ipax.apana.org.au/~itisus/index.html
47/50 The Australian Observer (Petersham, Australia) NB full text access only to subscribers?? http://www.ozemail.com.au:80/observer/
48/50 Commonwealth Consolidated Regulations (Sydney, Australia) http://www.austlii.edu.au/au/legis/cth/consol_reg/
49/50 Asian Studies WWW Virtual Library (Canberra, Australia) http://coombs.anu.edu.au/WWWVL-AsianStudies.html
50/50 Aboriginal Studies WWW Virtual Library (Canberra, Australia) http://coombs.anu.edu.au/WWWVL-Aboriginal.html

Appendix 2

Sample records in mini-database
Record 1/50, Record revised 1 Jan 1996

Author/originator National Library of Australia
Title Australian Governments' Entry Point (Canberra, Australia)
Point of contact National Library of Australia, Parkes Place, Canberra, ACT 2600, Australia Telephone +61 6 2621111. Facsimile +61 6 2571703 e-mail k.webb@nla.gov.au
Summary Directory of Australian federal and state government information on the internet, including current topical issues.
Language In English
Location http://www.nla.gov.au/oz/gov/
Access protocol http
Domain gov.au
Access conditions None
Subject headings Australia - Politics and government
Dewey 919.4
Control # -
Host www.nla.gov.au
Country code AU

Record 2/50, Record revised 1 Jan 1996

Author/originator Centre for Australian Studies in Wales /Canolfan Astudiaethau Awstralaidd Cymru
Title Centre for Australian Studies in Wales (Lampeter, Cymru)
Point of contact Centre for Australian Studies in Wales, University of Wales, Lampeter, Wales, SA48 7ED, UK. Telephone +44 0 1570 423423. Facsimile +44 0 1570 423423, e-mail sumner@lamp.ac.uk
Summary Home page for the Centre which aims to co-ordinate research on Australia, in particular Australian links with Wales. Links to Wales and Australia.
Language In English with some Welsh
Location http://www.lamp.ac.uk/oz/
Access protocol http
Domain ac.uk
Access conditions None
Subject headings Australian studies
Dewey 994.005 20
Control # -
Host www.lamp.ac.uk
Country code CYM
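
In the two sample records, the Access protocol, Host and Domain fields repeat information already present in the Location field. The following is a minimal sketch of how those three fields could be derived from a Location URL. It is not the method used to build the mini-database. The function name derive_fields is hypothetical, and the rule that Domain is the last two labels of the host name is an assumption that matches only the two records shown (gov.au and ac.uk).

```python
from urllib.parse import urlsplit


def derive_fields(location: str) -> dict:
    """Derive record fields from a Location URL (illustrative only)."""
    parts = urlsplit(location)
    host = parts.hostname or ""
    labels = host.split(".")
    # Assumption: "Domain" is the last two labels of the host name,
    # as in the two sample records (gov.au, ac.uk).
    domain = ".".join(labels[-2:]) if len(labels) >= 2 else host
    return {
        "Access protocol": parts.scheme,
        "Host": host,
        "Domain": domain,
    }


if __name__ == "__main__":
    # Location values taken from Records 1/50 and 2/50.
    for url in ("http://www.nla.gov.au/oz/gov/", "http://www.lamp.ac.uk/oz/"):
        print(url, derive_fields(url))
```

Fields such as Country code, Dewey and Subject headings cannot be derived this way. Record 2/50, for example, carries the country code CYM although its host is under ac.uk.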
