The Internet Archive is not interested in offering access
to Web sites or other Internet documents whose authors
do not want their materials in the collection. To remove
your site from the Wayback Machine, place a robots.txt
file at the top level of your site (e.g. www.yourdomain.com/robots.txt)
and then submit your site below.
The robots.txt file will do two things:
- It will remove all documents from your domain from
the Wayback Machine.
- It will tell us not to crawl your site in the future.
To exclude the Internet Archive's crawler (and remove
documents from the Wayback Machine) while allowing all
other robots to crawl your site, your robots.txt file
should say:
User-agent: ia_archiver
Disallow: /
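Before uploading the file, you may want to confirm that these two lines behave as described. The following is a minimal sketch, not part of the Internet Archive's own tooling: it uses Python's standard urllib.robotparser module, and the helper name is_allowed, the sample user agent "SomeOtherBot", and the page URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# The exact rules shown above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

# Hypothetical page on the placeholder domain used in this document.
PAGE_URL = "http://www.yourdomain.com/index.html"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The Internet Archive's crawler should be excluded from the whole site...
archive_blocked = not is_allowed(ROBOTS_TXT, "ia_archiver", PAGE_URL)
# ...while a robot with any other name (illustrative here) is unaffected.
others_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", PAGE_URL)

print("ia_archiver blocked:", archive_blocked)
print("other robots allowed:", others_allowed)
```

This only checks how a standard parser reads the rules; it does not confirm that the file is reachable at the top level of your site.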
Robots.txt is the most widely used method for controlling
the behavior of automated robots on your site (all major
robots, including those of Google, Alta Vista, etc., respect
these exclusions). It can be used to block access to the
whole domain, or to any file or directory within it. There
are many resources for webmasters and site owners describing
this method and how to use it.
Once you have put a robots.txt file up, submit your site
(www.yourdomain.com) using the form at
http://pages.alexa.com/help/webmasters/index.html#crawl_site.
The robots.txt file must be placed at the root of your domain
(www.yourdomain.com/robots.txt). If you cannot put a robots.txt
file up, submit a request to wayback2@archive.org.