Internet Archive Frequently Asked Questions

Frequently Asked Questions
[ About the Movies | About the Prelinger Movies | Audio | Contributing to the Archive | Downloading and Playing Movies | Forums | Software & CD-ROM Archive | Texts and Books | The Internet Archive | The Wayback Machine | Virtual Library Cards (AKA Accounts) ]

Questions

Who owns the rights to these movies?

Are there restrictions on the use of the Prelinger movies?

Can you point me to resources on the history of ephemeral films?

Is there a discussion list about the movies?

Are there other similar archives on the Web?

Why are there no post-1964 movies in the Prelinger collection?

Why does this site contain only movies produced in the United States?

What are those animations associated with each movie and how did you make them?

About the Movies

Who owns the rights to these movies?

Each collection has come from a donor, who may impose some restrictions on use and re-use. We are endeavouring to make it easy to understand what you can do with these movies, but this is a work in progress.

Are there restrictions on the use of the Prelinger movies?

The Prelinger movies are open and available to everyone without charges or fees. You are warmly encouraged to access, download, use, and reproduce these films in whole or part, in any medium or market throughout the world, for any purpose whatsoever except the following:

You may not sell or sell access to the data files on this site, in whole or in part. You may give or transfer them to any other person or company, but the gift or transfer must be free of charge.
You may not sell, represent, license, or charge for access to these films as stock footage.
You may not convert these files into other online distribution formats, except for open-source MPEG-4 formats. Please contact Prelinger Archives if you wish us to consider other alternatives.

Can you point me to resources on the history of ephemeral films?

See the bibliography and links to other resources at www.prelinger.com/ephemeral.html.

Is there a discussion list about the movies?

Yes — our list is about both movie content and technical issues. You can subscribe at moviearchive-subscribe@yahoogroups.com.

Are there other similar archives on the Web?

As far as we know, this is the only site that presents high-quality downloadable movie data files practically free of use restrictions. See the Links page at Prelinger Archives for a number of sites that may be useful to researchers or those seeking specific films or footage.

Why are there no post-1964 movies in the Prelinger collection?

Because of copyright law. While a high percentage of ephemeral films were never originally copyrighted or (if initially copyrighted) never had their copyrights properly renewed, copyright laws still protect most moving image works produced in the United States from 1964 to the present. Since this site exists to supply material to users without most rights restrictions, every title has been checked for copyright status. Those titles that either are copyrighted or whose status is in question have not been made available. For information on recent changes in copyright law, see the circular Duration of Copyright (in PDF format) published by the Library of Congress.

Why does this site contain only movies produced in the United States?

Again, the reason is copyright law. A great many ephemeral films produced in the United States are not currently protected by copyright, either because their original copyrights have expired without renewal or because they were not properly copyrighted before publication (for example, published without copyright notice in proper form). Films produced in most other nations enjoy a greater degree of copyright protection and, for the most part, could not be placed on this site without the permission of the copyright owners and other stakeholders.

What are those animations associated with each movie and how did you make them?

The animations on the details pages and on the browse pages are animated GIF files. In most cases, still shots from each minute of the program were grabbed and saved as JPG files (these are the thumbnails which you can reach by clicking on the "See movie scenes" links). Then a tool called ImageMagick was used to create the animated GIF files from the JPGs.
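
The last step can be sketched as a small Python script that hands a list of JPG stills to ImageMagick. This is only an illustration under assumptions: the file names and the one-second frame delay are hypothetical, and it uses ImageMagick's `convert` command with its `-delay` and `-loop` options rather than whatever invocation the Archive actually ran.

```python
import shutil
import subprocess
from pathlib import Path


def build_gif_command(frames, output, delay_cs=100):
    """Return an ImageMagick command line that joins JPG stills into an animated GIF.

    delay_cs is the pause between frames in hundredths of a second
    (an assumed value; the FAQ does not say what delay was used).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    return ["convert", "-delay", str(delay_cs), "-loop", "0",
            *[str(f) for f in frames], str(output)]


def make_gif(frame_dir, output):
    """Build the GIF from every .jpg in frame_dir, sorted by file name."""
    frames = sorted(Path(frame_dir).glob("*.jpg"))
    command = build_gif_command(frames, output)
    if shutil.which(command[0]) is None:
        raise RuntimeError("ImageMagick 'convert' was not found on PATH")
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the command layout.
    print(build_gif_command(["scene_001.jpg", "scene_002.jpg"], "movie.gif"))
```

Keeping the command construction separate from the call to ImageMagick makes it possible to inspect the command before running it on a real directory of stills.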

Questions

Do I need to credit the Internet Archive and Prelinger Archives when I reuse these movies?

Do I need to inform the Internet Archive and/or Prelinger Archives when I reuse these movies?

How can I get access to these movies on videotape or film?

About the Prelinger Movies

Do I need to credit the Internet Archive and Prelinger Archives when I reuse these movies?

We ask that you credit us as a source of archival material, in order to help make others aware of this site. We suggest the following forms of credit:

Archival footage supplied by the Internet Moving Images Archive (at archive.org) in association with Prelinger Archives

Archival footage supplied by the Internet Moving Images Archive (at archive.org)

"Archival footage supplied by archive.org"

Do I need to inform the Internet Archive and/or Prelinger Archives when I reuse these movies?

No. However, we would very much like to know how you have used this material, and we'd be thrilled to see what you've made with it. This may well help us improve this site. Please consider sending us a copy of your production (postal mail only), and let us know whether we can call attention to it on the site. Our address is:

Rick Prelinger
c/o Internet Moving Pictures Archive
PO Box 29064
San Francisco, CA 94129
United States

How can I get access to these movies on videotape or film?

Access to the movies stored on this site in videotape or film form is available to commercial users through Archive Films, representing Prelinger Archives for stock footage sales. Please contact Archive Films directly:

Archive Films/Archive Photos
75 Varick Street
New York, NY 10013
United States
+1 (646) 613-4100 (voice)
+1 (646) 613-4140 (fax)
+1 (800) 876-5115 (toll free in the US)
sales@archivefilms.com

Please visit us at www.prelinger.com/prelarch.html for more information on access to these and similar films. Prelinger Archives regrets that it cannot generally provide access to movies stored on this Web site in other ways than through the site itself. We recognize that circumstances may arise when such access should be granted, and we welcome email requests. Please address them to Rick Prelinger.

The Internet Archive does not provide access to these films other than through this site.

Questions

What is the etree.org archive all about?

Can I log into an FTP server to download these concerts?

What are SHN files?

What are MD5 files?

How can I listen to SHN files?

How do I burn SHN files to CD as audio tracks?

How can I download SHN files? They just show up as weird characters in my browser.

Audio

What is the etree.org archive all about?

etree.org is a network of mailing lists and FTP servers devoted to providing public access to high-quality digital recordings of live music performances. All of the concerts provided through these FTP servers are performances by musicians and bands whose taping and trading policies permit non-commercial recording and distribution of their live concerts. Since space and bandwidth are often of concern to FTP site administrators, the digitized recordings are hosted on most servers for only a short time. After a digitized recording disappears from a server, the only means of obtaining it is extraction from another media type (Digital Audio Tape, Minidisc, CD-R), a time-consuming process that can, in some cases, cause generational loss. The Internet Archive, a digital library with media of all types, is a natural ally for etree. Having the means to archive all of the digital recordings that circulate on etree FTP servers, together with the ready consent and support of the musicians and the trading community, provides a unique opportunity to ensure the high-quality longevity of thousands of live concerts from the 1960s onward.

Can I log into an FTP server to download these concerts?

Yes, you can log into etree01.archive.org or etree02.archive.org with the username anonymous and use your email address as the password.
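
For readers who prefer a script to an FTP client, the login described above might look like the following Python sketch, which uses the standard `ftplib` module. The function names and the example email address are illustrative assumptions, and the servers named above may not be reachable when you try this.

```python
from ftplib import FTP


def anonymous_credentials(email):
    """Return the (username, password) pair for anonymous FTP."""
    if "@" not in email:
        raise ValueError("anonymous FTP expects an email address as the password")
    return ("anonymous", email)


def list_top_level(host, email, timeout=30):
    """Log in anonymously and return the names in the server's top-level directory."""
    user, password = anonymous_credentials(email)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login(user=user, passwd=password)
        return ftp.nlst()
```

A call such as `list_top_level("etree01.archive.org", "you@example.com")` would then return the directory names if the server accepts the connection.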

What are SHN files?

SHN stands for shorten. It is a lossless compression algorithm for digital music, developed by SoftSound, that compresses music files to 50-60% of their original size with no loss in quality. See this FAQ.

portions from etree faq

What are MD5 files?

MD5 files contain checksums, strings of characters used to uniquely represent a file. These checksums enable users to verify that music files downloaded correctly.
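
As a rough illustration of that verification step, the Python sketch below recomputes each file's MD5 checksum and compares it with the value recorded in an MD5 file. It assumes each line of the MD5 file has the form `<checksum> *<file name>`, which is a common layout but is not confirmed by this FAQ, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_listing(text):
    """Parse lines of the assumed form '<checksum> *<file name>' into a dict."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        digest, name = line.split(None, 1)
        expected[name.lstrip("*").strip()] = digest.lower()
    return expected


def verify(md5_path):
    """Return {file name: True/False} for each file listed in an MD5 file.

    Files are looked up in the same directory as the MD5 file.
    """
    md5_path = Path(md5_path)
    expected = parse_md5_listing(md5_path.read_text())
    return {name: md5_of_file(md5_path.parent / name) == digest
            for name, digest in expected.items()}
```

A `False` entry in the result means the downloaded file does not match its recorded checksum and should be downloaded again.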

How can I listen to SHN files?

On the Macintosh: First download and install MacAmp Lite, a multi-format audio player, and then install the Shorten Plugin for MacAmp.

Using Windows: Download and install WinAmp, a multi-format audio player, and then install the ShnAmp Plugin for WinAmp.

How do I burn SHN files to CD as audio tracks?

You will first need to convert the SHN files to another format that your burning program is familiar with. Windows users can use Michael K. Weise's tool, mkwACT, to convert SHN files to WAV files, which are suitable for burning programs. For Macintosh users, Doug Hornig has created a tool appropriately titled Shorten for Macintosh.

How can I download SHN files? They just show up as weird characters in my browser.

To download SHN files on a PC, right-click the link to the file and select "Save Target As". On the Macintosh, hold the button down while the mouse is over the link, and when the menu comes up, select "Save Target As".

Questions

How do I add books that are online to the Archive?

What do FORMAT:URL and TITLE#URL mean on the

Contributing to the Archive

How do I add books that are online to the Archive?

The Open Source Books collection exists so that users like you can add books to the Archive. There are two ways to add a book, depending on whether or not the book you want to add is already in the Archive.

1) If the book you want to add is in the Archive, you can simply go to the book's detail page (by searching for it, or by browsing through the collections) and click the link that says "Click here to add your own edition of this text." This will bring you to a form for adding your edition of the book with the fields already filled in.

2) If the book you would like to add is not yet in the Archive (or you cannot find it for some reason), all you need to do is access this form. If you're adding a book by an author who has some books in the Archive, please strive to format the author's name exactly as it appears on the details pages for those books (this allows us to maintain consistency when users browse by author).

What do FORMAT:URL and TITLE#URL mean on the

FORMAT:URL represents a link to the actual book on the Internet. For example, suppose I have a copy of Alice in Wonderland stored on my web server in Adobe Acrobat format, with the URL http://www.myserver.com/books/alice.pdf. The Archive can link to this copy of the book from the details page that will be created once you submit the form. In order to know what format the book is in, we ask that you supply the URL of the book as FORMAT:URL. In this case this would be Adobe Acrobat:http://www.myserver.com/books/alice.pdf. The details page will then display a link in the "Read Texts" section of the page that looks like this: Adobe Acrobat which points to http://www.myserver.com/books/alice.pdf.

TITLE#URL represents a link to a web page related to the book. Continuing the Alice in Wonderland example, suppose I found a website called Alice's World, related to this book, located at http://www.alicesworld.com. For the details page to properly display the title, you must supply it in the Related web pages section of the form, formatted like this: Alice's World#http://www.alicesworld.com. The details page will then display a link like this: Alice's World which goes to http://www.alicesworld.com.
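
To make the two conventions concrete, here is a minimal Python sketch of how such entries could be split into their parts. How the Archive's form actually parses them is not stated in this FAQ; the function names and the choice to split at the first separator are assumptions made for illustration.

```python
def parse_format_url(entry):
    """Split 'FORMAT:URL' into (format, url) at the first colon.

    The URL itself contains colons (as in 'http://'), so only the first
    colon is treated as the separator.
    """
    fmt, sep, url = entry.partition(":")
    if not sep or not fmt.strip() or not url.strip():
        raise ValueError(f"expected FORMAT:URL, got {entry!r}")
    return fmt.strip(), url.strip()


def parse_title_url(entry):
    """Split 'TITLE#URL' into (title, url) at the first '#'.

    Assumption: the title contains no '#', so a '#' later in the URL
    (a fragment) stays with the URL.
    """
    title, sep, url = entry.partition("#")
    if not sep or not title.strip() or not url.strip():
        raise ValueError(f"expected TITLE#URL, got {entry!r}")
    return title.strip(), url.strip()


if __name__ == "__main__":
    print(parse_format_url("Adobe Acrobat:http://www.myserver.com/books/alice.pdf"))
    print(parse_title_url("Alice's World#http://www.alicesworld.com"))
```

Splitting at the first separator matters for FORMAT:URL in particular, because the URL half always contains at least one more colon.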

Questions

What software can play the downloaded movies?

What other software and equipment can I use?

Why does my computer hang or give me errors when I try to download or play a movie?

Can I download movies via FTP?

Why do I get errors when I try to play a movie?

Can I use these movies in FinalCutPro -- in the Quicktime format?

Sometimes when I play a movie, the video is choppy or very pixelated. Why is that?

Why does this site only offer such high-resolution copies that can't be easily played by everyone?

How can I search for movies?

How did you digitize the films?

An article on re-coding Prelinger Archive films to SVCD so you can watch them on your DVD player.

Where can I find more information on how to play movies on the macOS?

Where can I find more information on how to play movies on other operating systems?

Is there a discussion list for technical issues?

How can I use the MPEG2 files to make my own movie?

What about streaming the movies?

Downloading and Playing Movies

What software can play the downloaded movies?

For Windows:
MPEG1 (VCD) most players;
MPEG2 (DVD) shareware player from http://www.elecard.com, or for-pay quicktime6 plugin: http://www.apple.com/quicktime/products/mpeg2playback/ ;
MPEG4 quicktime6 from www.apple.com

For Mac OSX and 9:
MPEG1 (VCD) most players;
MPEG2 (DVD) freeware VLC ( http://www.videolan.org/ ) or the for-pay quicktime6 add-on (see http://www.apple.com/quicktime/products/mpeg2playback/ ).
MPEG-4 Quicktime6.

Please contact us if you have information about players.

What other software and equipment can I use?

You can try any of various players available for downloading. In addition, for better performance, you can add decoder board hardware to your computer.

PLAYERS: Try the evaluations of players at coolstf.com. Unfortunately, because computers can be set up in so many different ways and because different standards exist for playing video, finding a player that will work is a hit-and-miss process. If you have trouble playing the movies, try another player, post your question on our discussion list (moviearchive-subscribe@yahoogroups.com), or write to us at info@archive.org.

At present, besides Quicktime, we know of no other Macintosh players. See http://www.apple.com/quicktime/ for the free QT6 player for MPEG4 and the for-pay quicktime6 add-on for MPEG2 (DVD). We will update this page as players become available. Please contact us if you have information about Macintosh-compatible players or decoder boards.

HARDWARE: Using a decoder board shifts all the responsibility for decoding the video into hardware and lets you watch full-screen, full-motion video on just about any PC running Windows. Most decoder boards also include a video-out jack so that you can watch the output on a TV monitor or even record a film directly to a VCR. The Archive can't take responsibility for recommending any hardware solutions, but we've been happy with the Sigma Designs RealMagic Netstream 2000 card (for Windows machines).

At present, we know of no hardware solutions for the Macintosh. Please contact us if you have information about hardware for that platform.

Why does my computer hang or give me errors when I try to download or play a movie?

1. There is heavy traffic to our site. If you experience a delay, please try again later or at a different time of day.

2. You're behind a firewall and the firewall software is attempting to modify incoming bits. Contact your network or firewall administrator (to test, try downloading from outside the firewall first).

3. Your Internet connection went down or timed out. Check with your ISP or network administrator to see if there's a special policy about keeping a connection live.

4. If your browser seems to hang after a "100% downloaded" message, check to see that you have sufficient hard-disk and TMP disk space. Rebooting the system sometimes helps.

If you still have trouble, post your question on our discussion list (moviearchive-subscribe@yahoogroups.com) or write to us at info@archive.org.

Can I download movies via FTP?

Yes — via anonymous FTP at ftp.archive.org.

Why do I get errors when I try to play a movie?

1. You are trying to play an MPEG-2 file on a platform other than Windows or Linux. At present, you need the for-pay quicktime6 add-on to play MPEG-2 files on the Macintosh. We will update this page as players become available. Please contact us if you have information about players that work on platforms other than Windows.

2. Your player tried to stream the movie. (You may get a display of odd-looking text in the browser involving "application/octet-stream.") Try downloading the file again, but right-click the link to save the file to disk so that the player won't try to stream it. Our files will not stream.

3. Some conflict exists between your computer's configuration and the player you're using. Unfortunately, because PCs can be set up in so many different ways and because different standards exist for playing video, finding a player that will work is a hit-and-miss process. Try Rod Hewitt's evaluations of a number of players.

If you still have trouble, post your question on our discussion list (moviearchive-subscribe@yahoogroups.com) or write to us at info@archive.org.

Can I use these movies in FinalCutPro -- in the Quicktime format?

You can re-encode MPEG-2 movies to QuickTime for Final Cut Pro with Cleaner 5.0.2, using the following settings. There is no de-interlacing, so you don't lose anything. The files increase in size tenfold, so make sure you have enough hard-disk space. This procedure gives you QuickTime movies suitable for use with Final Cut Pro.

Cleaner 5 -- if you don't have 5.0.2, you can download 5.0.2 from the terran.com site.
- output > quicktime, .mov
- tracks > process everything
- image > image size constrain to 720*480, display size normal, do not deinterlace, field dominance-SHIFT DOWN
- encode > apple DV-ntsc codec, millions of colors, spatial quality 100%, frame rate same as source
- audio > we're still not sure which is best. Start with mono, 48kb, and experiment.

Some have had good results with their decoder cards. Compare a few films done both ways on a good monitor with scopes and see which method is best.

If you still have trouble, post your question on our discussion list (moviearchive-subscribe@yahoogroups.com) or write to us at info@archive.org.

Sometimes when I play a movie, the video is choppy or very pixelated. Why is that?

When we encode the video in MPEG-4, we first reduce its size to 320 x 240, a quarter of the resolution of NTSC video. We then encode it at 350 kbps, which is really borderline for that resolution. You see errors occasionally because there simply isn't enough bandwidth available, so the MPEG-4 encoder either drops frames (resulting in jerky or choppy motion) or drops macro blocks (resulting in blurred or pixelated video). That is the price we pay for the small file size: 80 MB for a 1/2-hour clip is really very small in the digital video world.
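
The figures in this answer can be checked with a little arithmetic, sketched here in Python. Two assumptions are made for illustration: full-resolution NTSC video is taken to be 640 x 480 pixels, and the calculation counts only the 350 kbps video stream, so the audio track accounts for the rest of the quoted file size.

```python
def video_megabytes(bitrate_kbps, duration_seconds):
    """Approximate stream size in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bitrate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000


def pixel_fraction(width, height, full_width=640, full_height=480):
    """Fraction of the full frame's pixels kept after downsizing.

    The 640 x 480 default for NTSC is an assumption, not a figure from the FAQ.
    """
    return (width * height) / (full_width * full_height)


if __name__ == "__main__":
    half_hour = 30 * 60  # seconds
    print(video_megabytes(350, half_hour))  # 78.75
    print(pixel_fraction(320, 240))         # 0.25
```

At 350 kbps, a half-hour clip works out to just under 79 MB of video, which is in line with the roughly 80 MB quoted above.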

Why does this site only offer such high-resolution copies that can't be easily played by everyone?

MPEG-2, a widely accepted standard for video playback, is a full-screen, full-motion compressed video format, most familiar to consumers as the format underlying the digital video disc (DVD) and digital satellite television (DBS). The image quality of MPEG-2 encoded files is far superior to files encoded in other formats, especially low-bandwidth streaming video.

The Archive's goal is to make high-quality video copies of the movies available to everyone. Unlike the thumbnail (less than full-screen, full-motion) quality offered by many sites, whose movies are usually subject to many rights restrictions, our video files can actually be downloaded, recorded to videotape, and displayed on TVs or monitors or even projected. We have sought to prove that the Internet can be a delivery medium for high-quality video without payment or restrictions. The high quality of the video files we offer makes them too large to stream, but technology marches on and this may be possible within the next few years.

How can I search for movies?

You can search from the navigation bar on any page in the Moving Images section of the site. You can also perform a more sophisticated search from the advanced search page.

How did you digitize the films?

Almost all the films in the Internet Moving Images Archive are held (by Prelinger Archives) in original film form (35mm, 16mm, 8mm, Super 8mm, and various obsolete formats like 28mm and 9.5mm). Films were first transferred to Betacam SP videotape, a widely used analog broadcast video standard, on telecine machines manufactured by Rank Cintel or Bosch. The film-to-tape transfer process is not a real-time process: It requires inspection of the film, repair of any physical damage, and supervision by a skilled operator who manipulates color, contrast, speed, and video controls.

The videotape masters created in the film-to-tape transfer suite were then digitized at Prelinger Archives in New York City using an encoding workstation built by Rod Hewitt. The workstation is a 550 MHz PC with a FutureTel NS320 MPEG encoder card. Custom software, also written by Rod Hewitt, drove the Betacam SP playback deck and managed the encoding process. The files were uploaded to hard disk through the courtesy of Flycode, Inc.

The files were encoded at constant bitrates ranging from 2.75 Mbps to 3.5 Mbps. Most were encoded at 480 x 480 pixels (2/3 D1) or 368 x 480 (roughly 1/2 D1). The encoder drops horizontal pixels during the digitizing process; during decoding, the decoder interpolates them to produce a 720 x 480 picture. (Rod Hewitt's site Coolstf shows examples of an image before and after this process.) Picture quality is equal to or better than most direct broadcast satellite television. Audio was encoded at MPEG-1 Layer 2, generally at 112 kbps. Both the MPEG-2 and MPEG-4 movies have mono audio tracks.

To convert the MPEG-2 video to MPEG-4, we used a program called FlasK MPEG. This is an MPEG-1/2 to AVI conversion tool that reads the source MPEG-2 and outputs an AVI file containing the video in MPEG-4 format and audio in uncompressed PCM format. We then used a program called Virtual Dub to recompress the audio in the MPEG-1 Layer 3 (MP3) format. This process is automated by the software that runs the system.
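
The frame widths and bitrates quoted for the MPEG-2 files can be related to each other with a short Python sketch. It assumes that the stated constant bitrate covers the whole file and that "D1" refers to the 720-pixel-wide decoded picture mentioned above; the ten-minute running time is a hypothetical example.

```python
FULL_D1_WIDTH = 720  # width of the decoded picture, in pixels


def width_fraction(encoded_width):
    """Fraction of the full D1 width that is actually stored in the file."""
    return encoded_width / FULL_D1_WIDTH


def mpeg2_megabytes(bitrate_mbps, minutes):
    """Approximate file size in megabytes (1 MB = 1,000,000 bytes).

    Assumes the constant bitrate applies to the file as a whole.
    """
    bits = bitrate_mbps * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    print(round(width_fraction(480), 3))  # 0.667, the "2/3 D1" case
    print(round(width_fraction(368), 3))  # 0.511, the "roughly 1/2 D1" case
    print(mpeg2_megabytes(2.75, 10))      # 206.25
    print(mpeg2_megabytes(3.5, 10))       # 262.5
```

Under these assumptions, a ten-minute film occupies roughly 206 to 263 MB, depending on the bitrate used.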

An article on re-coding Prelinger Archive films to SVCD so you can watch them on your DVD player.

See http://www.moviebone.com/

Where can I find more information on how to play movies on the macOS?

See http://www.archive.org/movies/macos.html

Where can I find more information on how to play movies on other operating systems?

For more details, troubleshooting, and how to play movies on other operating systems, see this how-to page.

Is there a discussion list for technical issues?

Yes — our list is about both technical issues and movie content. You can subscribe at moviearchive-subscribe@yahoogroups.com.

How can I use the MPEG2 files to make my own movie?

This has been challenging in the past, but we are told that Final Cut Pro on Mac OS X 10.2 (Jaguar) will import the MPEG2 file with the optional MPEG2 plugin module ( http://www.apple.com/quicktime/products/mpeg2playback/ ). Please send a note to moviearchive@yahoogroups.com if it does not.

What about streaming the movies?

You can watch the movies without downloading using RealPlayer from Real Networks (www.real.com). We support two bitrates: 32Kbps-192Kbps for modem and ISDN users plus 256Kbps-450Kbps for DSL and cable-modem users.

Questions

How can I make links clickable in my posts?

How can I format text in my posts?

Forums

How can I make links clickable in my posts?

You may have noticed that some posts have highlighted links in them. Internet Archive forums permit the use of HTML codes. Suppose you want to make a link to the Internet Archive home page, one that looks like this: Internet Archive home page. To do this, you would enter the following HTML code: <a href="http://www.archive.org">Internet Archive home page</a>.

How can I format text in my posts?

Since the Internet Archive forum system accepts HTML codes, you can make text bold, italic, underlined, or even colored by using normal HTML codes. See WebMonkey for a list of HTML codes.

Questions

What is the Software Archive?

How do you catalogue the records?

How many records does the database have?

How did you scan the images in? At what resolution and with what hardware?

What are the future plans of the Software Archive?

How can I help with this Archive?

Is any technical documentation about the Macromedia collection available?

Software & CD-ROM Archive

What is the Software Archive?

The Software Archive is a group within the Internet Archive, a non-profit organization located in the Presidio of San Francisco. The Archive is attempting to provide universal access to human knowledge, and in doing so, is digitizing, cataloguing, and archiving various creative endeavors. We feel these works should be preserved for future generations for the benefit of our society. The Software Archive started as a collection of CD-ROMs generously donated by Macromedia and we expect the entire Collection to grow to 20,000 titles by the end of 2003.

How do you catalogue the records?

Currently, we are using FileMaker V 5.5 as our database, with a layout we custom-designed for our uses. We catalogue the following information: Collection, CD Tag, Box # (where it is stored), product name, ISBN # (if applicable), primary language, product description (general classification/type), platform, copyright date, catalogue date, date of last update, cataloguer, publisher (if applicable), country, email of publisher, and the date permission was granted to the Archive to display the item with our collection.

How many records does the database have?

We currently have 8,595 titles in the database, all from the Macromedia Collection.

How did you scan the images in? At what resolution and with what hardware?

We attempted to scan both the front and back cover artwork, the front cover of any books that might have come with the software and both the covers of any boxes that the software might have come in. Furthermore, we scanned the images in true size without any cropping. We are currently using an HP Scanner ScanJet 7490C and scanning in at its highest resolution, which is True Color (16.7 million colors).

What are the future plans of the Software Archive?

We are currently speaking with Apple Inc. about archiving their collection of customer-donated CD-ROMs that have been collected over the last 7 years (the "Made With QuickTime" Program). The Macromedia & Apple Collections will consist of over 20,000 CD-ROMs when complete.

How can I help with this Archive?

We are currently looking for help with fundraising to continue our efforts in archiving these great works. To our knowledge, no one else is preserving these types of multimedia and attempting to make them available to the public free of charge. We feel collections like these are part of our cultural heritage and should be preserved for greater future use.

Is any technical documentation about the Macromedia collection available?

Yes, you can download a Microsoft Word file written by Lucille Tang and Lisa Leigh in July of 2002 here.

Questions

How do I view the DJVU books?

How do I view the PDF books?

How do I download a book in tk3 format?

Texts and Books

How do I view the DJVU books?

DJVU is an open format for scanned documents. There are free readers available at:

http://www.lizardtech.com/download/?x=2&p=1&o=2&titl=Download%20DjVu%20Browser%20Plug-in

for Windows, Mac, Linux, Mac OS X, and Solaris.
Try it. We like this compact, searchable, good looking, and open format.

How do I view the PDF books?

Books that are available in PDF format require Adobe Acrobat. The software is free to download and use.

How do I download a book in tk3 format?

This is a beautiful format, and well worth trying. To download a reader for Windows and Mac (pre-OS X), go to http://www.nightkitchen.com/download/reader/index.phtml

Questions

What's the significance of the Archive's collections?

What is the nonprofit status of the Internet Archive? Where does its funding come from?

Does the Archive issue grants?

How do I contact the Internet Archive?

The Internet Archive

What's the significance of the Archive's collections?

Societies have always placed importance on preserving their culture and heritage. But much early 20th-century media -- television and radio, for example -- was not saved. The Library of Alexandria -- an ancient center of learning containing a copy of every book in the world -- disappeared when it was burned to the ground.

What is the nonprofit status of the Internet Archive? Where does its funding come from?

The Internet Archive is a 501(c)(3) public nonprofit organization. It receives in-kind and financial donations from Alexa Internet, the Kahle/Austin Foundation, and Quantum Corporation.

Does the Archive issue grants?

No. Although we promote the development of other Internet libraries through colloquia and other means, the Archive is not a grant-making organization.

How do I contact the Internet Archive?

Questions about the Wayback Machine should be addressed to wayback@archive.org. General questions about the Internet Archive, or other archive projects, should be addressed to info@archive.org.

Questions

How can I get my site included in the Archive?

How can I remove my site's pages from the Wayback Machine?

What is the Internet Archive Wayback Machine?

Can I link to old pages on the Wayback Machine?

Why isn't the site I'm looking for in the archive?

What does it mean when a site's archive data has been "updated"?

Who was involved in the creation of the Internet Archive Wayback Machine?

How was the Wayback Machine made?

How large is the Archive?

What type of machinery is used in this Internet Archive?

How do you archive dynamic pages?

Why are some sites harder to archive than others?

Some sites are not available because of robots.txt or other exclusions. What does that mean?

How can I help the Internet Archive and the Wayback Machine?

Can I search the Archive?

When will you offer text search for the Wayback Machine?

Why am I getting broken or gray images on a site?

How do I contact the Internet Archive?

What is the Wayback Machine's Copyright Policy?

Why is the Internet Archive collecting sites from the Internet? What makes the information useful?

Do you archive email? Chat?

Do you collect all the sites on the Web?

Is there any personal information in these collections?

Who has access to the collections? What about the public?

How can I get a copy of the pages on my Web site? If my site got hacked or damaged, could I get a backup from the Archive?

Can I download an entire site from the Wayback Machine?

Can people download sites from the collections?

How do you protect my privacy if you archive my site?

The Wayback Machine

How can I get my site included in the Archive?

Alexa Internet has been crawling the web since 1996, which has resulted in a massive archive. If you have a web site that you would like to see saved for posterity in the Internet Archive, and you've searched the Wayback Machine and found no results, there are three ways to have it crawled.

Method 1: visit Alexa's "Webmasters" page at http://pages.alexa.com/help/webmasters/index.html#crawl_site.

Method 2: if you have the Alexa tool bar installed, just visit a site.

Method 3: while visiting a site, use the 'show related links' in Internet Explorer, which uses the Alexa service.

Sites are usually crawled within 24 hours and no more then 48. Crawled sites will be added to Wayback in about 6 months.

How can I remove my site's pages from the Wayback Machine?

The Internet Archive is not interested in preserving or offering access to Web sites or other Internet documents of persons who do not want their materials in the collection. By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled as well as exclude any historical pages from the Wayback Machine.

You can find exclusion directions at exclude.php. If you have further questions, you may email wayback2@archive.org.

What is the Internet Archive Wayback Machine?

The Internet Archive Wayback Machine is a service that allows people to visit archived versions of Web sites. Visitors to the Wayback Machine can type in a URL, select a date range, and then begin surfing on an archived version of the Web. Imagine surfing circa 1999 and looking at all the Y2K hype, or revisiting an older version of your favorite Web site. The Internet Archive Wayback Machine can make all of this possible. See our press release at http://www.archive.org/about/press_release.php.

Can I link to old pages on the Wayback Machine?

Yes! The Wayback Machine is built so that it can be used and referenced. If you find an archived page that you would like to reference on your Web page or in an article, you can copy the URL. You can even use fuzzy URL matching and date specification... but that's a bit more advanced (check out our advanced search page at http://web.archive.org/collections/web/advanced.html).
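As a rough illustration of linking to an archived page by date, the sketch below builds such a link in Python. It assumes the commonly seen address pattern http://web.archive.org/web/<timestamp>/<original URL>, with a YYYYMMDDhhmmss timestamp and "*" standing for "all captures". This FAQ does not spell out that pattern, so treat it as an assumption. wayback_url is a hypothetical helper name.

```python
from datetime import datetime
from typing import Optional

# Assumed URL pattern; not stated in this FAQ.
WAYBACK_PREFIX = "http://web.archive.org/web"


def wayback_url(original_url: str, when: Optional[datetime] = None) -> str:
    """Build a link to an archived copy of original_url.

    With a datetime, the link names a specific moment; without one,
    "*" is used to ask for the list of all captures (assumed behaviour).
    """
    timestamp = when.strftime("%Y%m%d%H%M%S") if when else "*"
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


if __name__ == "__main__":
    # A page as it might have looked at the end of 1999.
    print(wayback_url("http://www.archive.org/", datetime(1999, 12, 31, 23, 59, 59)))
    # Every capture of the same page.
    print(wayback_url("http://www.archive.org/"))
```

If the pattern holds, the resulting string can be pasted into a Web page or an article like any other URL.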

Why isn't the site I'm looking for in the archive?

Some sites may not be included because the automated crawlers were unaware of their existence at the time of the crawl. It's also possible that some sites were not archived because they were password protected or otherwise inaccessible to our automated systems or because the Web site administrator has requested removal of the site from the Archive. Note: some pages appear in the Election 2000 archive and not the main archive.

What does it mean when a site's archive data has been "updated"?

When our automated systems crawl the web every few months or so, we find that only about 50% of all pages on the web have changed since our previous visit. This means that much of the content in our archive is duplicate material. If you don't see "*" next to an archived document, then the content on the archived page is identical to the previously archived copy.
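One way to picture the "*" marker is as the result of comparing each capture with the one before it. The sketch below is only an illustration: comparing content hashes is an assumption, since this FAQ does not say how the Archive detects changes, and mark_updates is a hypothetical name.

```python
import hashlib


def mark_updates(captures):
    """Flag each capture whose content differs from the previous one.

    captures: list of (date, content_bytes) pairs in chronological order.
    Returns a list of (date, updated) pairs. The first capture is always
    flagged, because there is nothing earlier to compare it with.
    """
    marked = []
    previous_digest = None
    for date, content in captures:
        digest = hashlib.sha256(content).hexdigest()
        marked.append((date, digest != previous_digest))
        previous_digest = digest
    return marked


if __name__ == "__main__":
    history = [
        ("2001-01-15", b"<html>version one</html>"),
        ("2001-03-20", b"<html>version one</html>"),  # unchanged, no "*"
        ("2001-06-02", b"<html>version two</html>"),  # changed, gets "*"
    ]
    for date, updated in mark_updates(history):
        print(date, "*" if updated else "")
```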

Who was involved in the creation of the Internet Archive Wayback Machine?

"The original idea for the Internet Archive Wayback Machine began in 1996, when the Internet Archive first began archiving the web. Now, five years later, with over 100 terabytes and a dozen web crawls completed, the Internet Archive has made the Internet Archive Wayback Machine available to the public. The Internet Archive has relied on donations of web crawls, technology, and expertise from Alexa Internet and others. The Internet Archive Wayback Machine is owned and operated by the Internet Archive."

How was the Wayback Machine made?

Over 100 terabytes of data are stored on several dozen modified servers. Alexa Internet, in cooperation with the Internet Archive, has designed a three-dimensional index that allows browsing of web documents over multiple time periods, and turned this unique feature into the Wayback Machine.

How large is the Archive?

The Internet Archive Wayback Machine contains over 100 terabytes of data and is currently growing at a rate of 12 terabytes per month. This eclipses the amount of text contained in the world's largest libraries, including the Library of Congress. If you tried to place the entire contents of the archive onto floppy disks (we don't recommend this!) and laid them end to end, it would stretch from New York, past Los Angeles, and halfway to Hawaii.
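The floppy disk comparison can be checked with a little arithmetic. The sketch below assumes 3.5-inch "1.44 MB" disks (1,474,560 bytes each, about 90 mm wide) and reads "100 terabytes" as 100 x 10^12 bytes. None of these figures come from this FAQ, so the result is only a ballpark.

```python
ARCHIVE_BYTES = 100 * 10**12   # "over 100 terabytes", taken as decimal terabytes
FLOPPY_BYTES = 1_474_560       # assumed capacity of a 3.5-inch "1.44 MB" disk
FLOPPY_WIDTH_M = 0.090         # assumed width of one disk: 90 mm


def floppy_chain(total_bytes=ARCHIVE_BYTES):
    """Return (number of disks, length in km) for disks laid end to end."""
    disks = -(-total_bytes // FLOPPY_BYTES)  # ceiling division
    length_km = disks * FLOPPY_WIDTH_M / 1000
    return disks, length_km


if __name__ == "__main__":
    disks, length_km = floppy_chain()
    print(f"{disks:,} disks, about {length_km:,.0f} km end to end")
```

Under these assumptions the chain comes to roughly 6,100 km, and at the stated growth of 12 terabytes per month it would lengthen by several hundred kilometres each month.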

What type of machinery is used in this Internet Archive?

The Internet Archive is stored on dozens of slightly modified Hewlett-Packard servers. The computers run the FreeBSD operating system. Each computer has 512 MB of memory and can hold just over 300 gigabytes of data on IDE disks.

How do you archive dynamic pages?

There are many different kinds of dynamic pages, some of which are easily stored in an archive and some of which fall apart completely. When a dynamic page renders standard html, the archive works beautifully. When a dynamic page contains forms, JavaScript, or other elements that require interaction with the originating host, the archive will not contain the original site's functionality.

Why are some sites harder to archive than others?

If you look at our collection of archived sites, you will find some broken pages, missing graphics, and some sites that aren't archived at all. Here are some things that make it difficult to archive a web site:

Robots.txt -- We respect robot exclusion headers.
JavaScript -- JavaScript elements are often hard to archive, especially if they generate links without having the full name in the page. Plus, if JavaScript needs to contact the originating server in order to work, it will fail when archived.
Server-side image maps -- Like any functionality on the web, if it needs to contact the originating server in order to work, it will fail when archived.
Unknown sites -- The archive contains crawls of the Web completed by Alexa Internet. If Alexa doesn't know about your site, it won't be archived. Use the Alexa Toolbar (available at www.alexa.com), and it will know about your page. Or you can visit Alexa's Archive Your Site page at http://pages.alexa.com/help/webmasters/index.html#crawl_site.
Orphan pages -- If there are no links to your pages, the robots won't find them (the robots don't enter queries in search boxes).

As a general rule of thumb, simple html is the easiest to archive.
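To see why script-generated links are hard to follow, consider how a simple crawler finds links: it reads the HTML and collects the href values it can see. The sketch below is a toy link extractor built on Python's standard html.parser module, not the Alexa crawler, and the sample page is invented for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag that appears in the markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: one plain link, and one address that only exists
# after the browser runs the script and joins the pieces together.
PAGE = """
<html><body>
<a href="/about.html">About</a>
<script>
  var section = "news";
  function go() { window.location = "/" + section + "/index.html"; }
</script>
<span onclick="go()">News</span>
</body></html>
"""


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    print(extract_links(PAGE))
```

Only the plain link is found. The news page is never written out in full anywhere in the markup, so a crawler that does not run scripts has nothing to follow.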

Some sites are not available because of robots.txt or other exclusions. What does that mean?

The Standard for Robot Exclusion (SRE) is a means by which web site owners can instruct automated systems not to crawl their sites. Web site owners can specify files or directories that are disallowed from a crawl, and they can even create specific rules for different automated crawlers. All of this information is contained in a file called robots.txt. While robots.txt has been adopted as the universal standard for robot exclusion, compliance with robots.txt is strictly voluntary. In fact, most web sites do not have a robots.txt file, and many web crawlers are not programmed to obey the instructions anyway.

However, Alexa Internet, the company that crawls the web for the Internet Archive, does respect robots.txt instructions, and even does so retroactively. If a web site owner prefers not to have a web crawler visiting the site's files and sets up robots.txt on the site, the Alexa crawlers will stop visiting those files and will make unavailable all files previously gathered from that site. This means that sometimes, while using the Internet Archive Wayback Machine, you may find a site that is unavailable due to robots.txt or other exclusions.

Other exclusions? Yes, sometimes a web site owner will contact us directly and ask us to stop crawling or archiving a site, and we endeavor to comply with these requests.
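As a small illustration of per-crawler rules in robots.txt, the sketch below feeds a sample file to Python's standard urllib.robotparser module and asks what each crawler may fetch. The file contents are hypothetical, and the user-agent name ia_archiver is assumed here to be the one the Alexa crawler answers to; check the exclusion directions for the exact name before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one crawler out of the whole site, and
# keep every other crawler out of a single directory.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /images/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if robots.txt lets this crawler fetch this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ia_archiver", "http://www.example.com/index.html"),
        ("SomeOtherBot", "http://www.example.com/index.html"),
        ("SomeOtherBot", "http://www.example.com/images/logo.gif"),
    ]:
        print(agent, url, allowed(agent, url))
```

The parser only reports what the file asks for. As noted above, compliance is voluntary, so a crawler that ignores robots.txt is not stopped by it.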

How can I help the Internet Archive and the Wayback Machine?

The Internet Archive actively seeks donations of digital materials for preservation. If you have digital materials that may be of interest to future generations, please let us know by submitting a proposal at http://www.archive.org/internet/proposal.html. The Internet Archive is also seeking additional funding to continue this important mission. You may make a donation through the Amazon.com Honor System at http://www.amazon.com/paypage/PFW9L3HMJTPIQ.

Can I search the Archive?

Using the Internet Archive Wayback Machine, it is possible to search for the names of sites contained in the Archive (URLs) and to specify date ranges for your search. However, we do not yet have an indexed text search of the documents in the collection. We continue to work on it and should have a full text search soon.

When will you offer text search for the Wayback Machine?

We do not yet have an indexed text search of the documents in the collection. This is a large and complicated project, but we continue to work on it and should have a full text search soon.

Why am I getting broken or gray images on a site?

Broken images (when there is a small red "x" where the image should be) occur when the images are not available on our servers. Usually this means that we did not archive them. Gray images are the result of robots.txt exclusions. The site in question may have blocked robot access to their images directory.

How do I contact the Internet Archive?

Questions about the Wayback Machine should be addressed to wayback@archive.org. General questions about the Internet Archive, or other archive projects, should be addressed to info@archive.org.

What is the Wayback Machine's Copyright Policy?

The Internet Archive respects the intellectual property rights and other proprietary rights of others. The Internet Archive may, in appropriate circumstances and at its discretion, remove certain content or disable access to content that appears to infringe the copyright or other intellectual property rights of others. If you believe that your copyright has been violated by material available through the Internet Archive, please provide the Internet Archive Copyright Agent with the following information:

Identification of the copyrighted work that you claim has been infringed;
An exact description of where the material about which you complain is located within the Internet Archive collections;
Your address, telephone number, and email address;
A statement by you that you have a good-faith belief that the disputed use is not authorized by the copyright owner, its agent, or the law;
A statement by you, made under penalty of perjury, that the above information in your notice is accurate and that you are the owner of the copyright interest involved or are authorized to act on behalf of that owner;
Your electronic or physical signature.

The Internet Archive Copyright Agent can be reached as follows:

Internet Archive Copyright Agent
Internet Archive
Presidio of San Francisco
P.O. Box 29244
San Francisco, CA 94129
Phone: 415-561-6767
Email: info@archive.org

Why is the Internet Archive collecting sites from the Internet? What makes the information useful?

Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive's mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars. The Archive collaborates with institutions including the Library of Congress and the Smithsonian.

Do you archive email? Chat?

No, we do not collect or archive chat systems or personal email messages that have not been posted to Usenet bulletin boards or publicly accessible online message boards.

Do you collect all the sites on the Web?

No, we collect only publicly accessible Web pages. We do not archive pages that require a password to access, pages tagged for "robot exclusion" by their owners, pages that are only accessible when a person types into and sends a form, or pages on secure servers. If a site owner properly requests removal of a Web site through http://www.archive.org/internet/remove/html, we will remove that site from the Archive.

Is there any personal information in these collections?

We collect Web pages that are publicly accessible. These may include pages with personal information.

Who has access to the collections? What about the public?

The Archive makes the collections available at no cost to researchers, historians, and scholars. At present, it takes someone with a certain level of technical knowledge to access them, but there is no requirement that a user be affiliated with any particular organization.

How can I get a copy of the pages on my Web site? If my site got hacked or damaged, could I get a backup from the Archive?

Our terms of use don't cover backups for the general public. However, you may use the Internet Archive Wayback Machine to locate and access archived versions of your web site. We can't guarantee that your site has been or will be archived.

Can I download an entire site from the Wayback Machine?

We do not currently offer any method to download entire sites from the Wayback Machine.

Can people download sites from the collections?

Our terms of use specify that users of the collections are not to copy data from the collections. If there are special circumstances that you think the Archive should consider, please contact info@archive.org.

How do you protect my privacy if you archive my site?

The Archive collects Web pages that are publicly available, the same ones that you might find as you surfed around the Web. We do not archive pages that require a password to access, pages tagged for "robot exclusion" by their owners, pages that are only accessible when a person types into and sends a form, or pages on secure servers. We also provide information on removing a site from the collections. Those who use the collections must agree to certain terms of use.

Like a public library, the Archive provides free and open access to its collections to researchers, historians, and scholars. Our cultural norms have long promoted access to documents that were, but no longer are, publicly accessible.

Given the rate at which the Internet is changing (the average life of a Web page is only 77 days), if no effort is made to preserve it, it will be entirely and irretrievably lost. Rather than let this moment slip by, we are proceeding with documenting the growth and content of the Internet, using libraries as our model.

If you are interested in these issues, please join and contribute to our announcement and discussion lists.

Questions

I forgot my password, what can I do?

When I attempt to log in using my username and password, I am told that the username or password is invalid. What could be wrong?

What is the difference between a virtual library card and an account?

How do I change my password?

How do I change my screen name?

What happens to my forum posts and movie, software, audio, and book reviews when I change my screen name?

What happens if my email address changes? How can I change my email address?

How can I remove my account?

Virtual Library Cards (AKA Accounts)

I forgot my password, what can I do?

As long as you remember the email address which you originally used when signing up for your virtual library card, you can use this form to have your password emailed to you. Bear in mind that your password will be sent in clear text, which means that anyone who views the email (or anyone with sophisticated "packet sniffing" software) can obtain your password. For this reason you should return to the Internet Archive website once you have your old password and change it to something new.

When I attempt to log in using my username and password, I am told that the username or password is invalid. What could be wrong?

There are several things to keep in mind when you encounter this error.

Your username is your email address, not your screen name. Make sure you enter the same email address that you supplied when signing up for your virtual library card.

Your password is case-sensitive. Check to see if the CAPS LOCK key is engaged (typically a light on your keyboard is lit when it is).

You might have forgotten your password. If you think this is the case, you can have your password emailed to you here.

What is the difference between a virtual library card and an account?

These two terms are used interchangeably.

How do I change my password?

You can use this form to change your password.

How do I change my screen name?

You can use this form to change your screen name.

What happens to my forum posts and movie, software, audio, and book reviews when I change my screen name?

Your old reviews and posts will be updated with your new screen name.

What happens if my email address changes? How can I change my email address?

You can use this form to change your email address.

How can I remove my account?

You can use this form to remove your account.
