COAST Security Archive: The Mirror Program Page

Overview

In order to have as much information as possible in the archive, we download data from other sites and incorporate it into our overall subject trees. This data is gathered using the mirror program. This program updates the archive so that the files in it mirror the files on remote machines scattered all over the world. The mirror program can be obtained from one of these sites:

  1. ftp://src.doc.ic.ac.uk/computing/archiving/mirror
  2. ftp://ftp.th-darmstadt.de/pub/networking/mirror
  3. ftp://ftp.sun.ac.za/pub/packages/mirror

The mirror program operates by being told which files to look for. It periodically examines all the files on the remote site, and if it discovers any files that we don't have locally, it downloads them into the archive. If any files have changed on the remote site, it downloads the new files and replaces the old copies we had locally. Thus our archive is continually updated by this program so that it holds the most up-to-date information.

How does mirroring work?

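The mirror program is located in /u/coast2/ftp-admin/mirror/mirror. It is a Perl script which is run by the nightly script. It compares the remote and local directory trees. Files are updated under the conditions described in the overview: a file that exists on the remote site but not locally is downloaded, and a file that has changed on the remote site is downloaded again and replaces the local copy.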
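
As a rough illustration of that comparison, here is a minimal sketch in Python (the real mirror program is a Perl script, so this is not its code). It assumes a file counts as changed when its size or modification time differs, which this page does not actually state. The function name plan_updates and the sample listings are hypothetical.

```python
def plan_updates(remote, local):
    """Compare two directory listings.

    remote, local: dicts mapping a relative path to a (size, mtime) tuple.
    Returns (to_get, to_replace): paths missing locally, and paths whose
    remote (size, mtime) differs from the local copy (an assumed test).
    """
    to_get = []
    to_replace = []
    for path, meta in sorted(remote.items()):
        if path not in local:
            to_get.append(path)        # new on the remote site
        elif local[path] != meta:
            to_replace.append(path)    # changed on the remote site
    return to_get, to_replace


# Hypothetical listings, for illustration only.
remote = {
    "mytool.README": (1200, 794000000),
    "mytool.solaris-v1.5.tar.Z": (48000, 794100000),
}
local = {
    "mytool.README": (1100, 790000000),
}

new_files, changed_files = plan_updates(remote, local)
print("download:", new_files)
print("replace:", changed_files)
```

Files that exist only locally are left alone in this sketch, since the page does not say what mirror does with them.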

What is a package?

The key to understanding how to use mirror is the concept of a package. A package is simply a collection of files and directories. Nothing more, nothing less. It can be a single file, or thousands of files and nested sub-directories. A package tells the mirror program exactly which files to obtain from the remote site. The files can be specified using wildcards and pattern matching, so a simple package can obtain many files.

Why do we need this? Without packages, it would be a real pain to tell the mirror program to obtain every file related to a particular tool. Consider a typical tool on a remote ftp server. It probably consists of at least a README file and some archive files, generated by tar and probably compressed using gzip or compress. The author of the tool has probably written it for multiple platforms, so there could be a number of files with very similar names. Finally, the author may have left an older version of the code there too.

A tool on a remote site could look similar to this:
mytool.README
mytool.solaris-v1.0.tar.Z
mytool.solaris-v1.5.tar.Z
mytool.sunos-v1.0.tar.Z
mytool.sunos-v1.5.tar.Z
mytool.ultrix-v1.5.tar.Z
READ.ME.TOO

So to actually present this tool usefully to our archive users, we must gather all these files. Notice how the most important ones all start with a common name: mytool. Let us assume that this is the name of the actual tool. That would also be a good name for the package we are going to tell the mirror program about. We will write a description telling the mirror program exactly where to locate these files, which ones to get, which ones to ignore and where to put them locally. We will then give all this information a name. This becomes a package from our point of view. We no longer view these files in isolation - they are part of some larger structure.
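
To make the idea concrete, here is a small Python sketch that picks the files for a package out of a listing like the one above. It assumes that patterns are regular expressions, and that a file is kept when it matches the "get" pattern and does not match the "exclude" pattern. The function name select_files and both patterns are made up for illustration and are not taken from the mirror program.

```python
import re


def select_files(names, get_patt, exclude_patt=None):
    """Return the names that match get_patt and do not match exclude_patt."""
    keep = []
    for name in names:
        if not re.search(get_patt, name):
            continue                   # not part of the package
        if exclude_patt and re.search(exclude_patt, name):
            continue                   # explicitly ignored
        keep.append(name)
    return keep


listing = [
    "mytool.README",
    "mytool.solaris-v1.0.tar.Z",
    "mytool.solaris-v1.5.tar.Z",
    "mytool.sunos-v1.0.tar.Z",
    "mytool.ultrix-v1.5.tar.Z",
    "READ.ME.TOO",
]

# Hypothetical patterns: take everything named mytool.*, skip version 1.0.
selected = select_files(listing, r"^mytool\.", r"-v1\.0\.")
for name in selected:
    print(name)
```

One pattern pair covers the README and every current archive, which is the point of describing the tool as a package rather than file by file.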

Mirror Packages file

The mirror packages file gives information on what packages to mirror. It is located in /u/coast2/ftp-admin/packages. This file consists of a series of records, each giving a specific item to mirror. The format of each entry is as follows (these are the more important fields):
package=
       comment         =-->  <--
       site            =
       remote_dir      =
       local_dir       +
       recurse         =
       get_patt        =
       exclude_patt    =

As an example, here is a sample entry:

package=legal_bytes
        comment         =--> Newsletter of Emerging Legal Issues  <--
        site            =ftp.eff.org
        remote_dir      =/pub/Publications/E-journals/Legal_Bytes
        local_dir       +mirrors/ftp.eff.org/Legal_Bytes/

Notice how this sample entry does not use all the fields shown above. If a field is not given, it is filled in with a default value. The package field is a unique name for the package. Every package in the file must have a unique package name.
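
The record layout can be illustrated with a short Python sketch that reads entries like the one above into a dictionary. This is only a sketch of the format as shown on this page, not the parser used by mirror itself. The default values in DEFAULTS are hypothetical placeholders (the real ones come from the defaults file), and the names parse_packages, FIELD and DEFAULTS are invented here.

```python
import re

# keyword, then "=" or "+", then the value (as in the entries shown above)
FIELD = re.compile(r"^\s*(\w+)\s*([=+])(.*)$")

# Hypothetical defaults, for illustration only.
DEFAULTS = {
    "recurse": ("=", "true"),
    "get_patt": ("=", "."),
    "exclude_patt": ("=", ""),
}


def parse_packages(text):
    """Return {package name: {field: (operator, value)}}."""
    packages = {}
    current = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue                   # blank or unrecognised line
        key, op, value = match.group(1), match.group(2), match.group(3).strip()
        if key == "package":
            if value in packages:
                raise ValueError("duplicate package name: " + value)
            current = dict(DEFAULTS)   # start from the defaults
            packages[value] = current
        elif current is not None:
            current[key] = (op, value)
    return packages


ENTRY = """\
package=legal_bytes
        comment         =--> Newsletter of Emerging Legal Issues  <--
        site            =ftp.eff.org
        remote_dir      =/pub/Publications/E-journals/Legal_Bytes
        local_dir       +mirrors/ftp.eff.org/Legal_Bytes/
"""

pkgs = parse_packages(ENTRY)
print(pkgs["legal_bytes"]["site"])
print(pkgs["legal_bytes"]["recurse"])
```

The sketch rejects a repeated package name because every package in the file must have a unique name.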

The site field specifies the address of the remote site. This should be stripped of any header information (e.g. ftp://) and trailing pathnames.
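
If you are starting from a full URL, the value for the site field can be extracted as in the following Python sketch. The helper name site_from_url is hypothetical. It only uses the standard urllib.parse module.

```python
from urllib.parse import urlsplit


def site_from_url(url):
    """Return just the host name, without the scheme or any path."""
    if "://" not in url:
        url = "//" + url               # let urlsplit treat it as a host
    return urlsplit(url).hostname


print(site_from_url("ftp://ftp.eff.org/pub/Publications/E-journals/Legal_Bytes"))
print(site_from_url("ftp.eff.org/pub"))
```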

The remote_dir entry specifies where to locate the files on the remote server. If only a single file or a group of files is to be downloaded, this is the directory containing them. If a directory hierarchy is to be mirrored, this is the root of that directory tree on the remote site.

The local_dir entry specifies where to place the files locally in the archive. In our archive, we have split the data into a subject tree and a mirrors tree. The data obtained by the mirror program is placed in the mirrors tree - the path to the files is obtained by concatenating the default path to the mirrors directory (from the defaults file) with the path specified in the package entry.

For example, if the path to the local directory was /u/coast2/ftp/pub, then the above package would place any files and subdirectories downloaded into /u/coast2/ftp/pub/mirrors/ftp.eff.org/Legal_Bytes/.
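
The concatenation can be sketched in a few lines of Python. The sketch assumes that the + in a local_dir line means "append to the default path", and that a plain = would replace the default outright; the second half is a guess from the notation, not something this page states. The function name resolve_local_dir is hypothetical.

```python
import posixpath


def resolve_local_dir(default_dir, op, value):
    """Combine the default directory with a package's local_dir value."""
    if op == "+":
        return posixpath.join(default_dir, value)   # append to the default
    return value                                    # assumed: "=" overrides


path = resolve_local_dir(
    "/u/coast2/ftp/pub", "+", "mirrors/ftp.eff.org/Legal_Bytes/"
)
print(path)
```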

Mirror Defaults File

The mirror defaults file controls overall aspects of the mirror program.

How to run the mirror program

The mirror program takes a selection of command line arguments, and a package file. The arguments control the behaviour of the program. They override the options specified in the defaults file. The package file is a file consisting of at least one package entry. To run the mirror program, simply have the directory /u/coast2/ftp-admin/mirror in your path. Then execute:

mirror -d ../packages/security-archive.biweekly

The -d option causes the mirror program to report what it is doing on the standard output. This is used to generate the biweekly mail message giving the output of the mirror program.

In this case, the mirror program processes every package in the packages file. If you don't want to actually do anything, but just want to see what would have happened, use the -n option. As an example:

mirror -n -d ../packages/security-archive.biweekly

To mirror a specific package rather than the whole packages file, you must specify that package by name on the command line. The -p option is used for this. It takes the name of a package, exactly as it is called in the packages file, and updates the archive to reflect any changes that may have taken place in that package on the remote site:

mirror -d -p<package1> ../packages/security-archive.biweekly

will mirror the package named package1 from the biweekly control file.
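
If these commands are driven from another script, the argument list can be assembled as in the Python sketch below. It uses only the -n, -d and -p options described on this page; the function name mirror_command is hypothetical, and the sketch builds the command without running it.

```python
def mirror_command(packages_file, package=None, dry_run=False, verbose=True):
    """Build the argument list for one run of the mirror program."""
    argv = ["mirror"]
    if dry_run:
        argv.append("-n")              # only show what would happen
    if verbose:
        argv.append("-d")              # report progress on standard output
    if package is not None:
        argv.append("-p" + package)    # restrict the run to one package
    argv.append(packages_file)
    return argv


cmd = mirror_command("../packages/security-archive.biweekly",
                     package="legal_bytes", dry_run=True)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run from a directory where the relative path to the packages file is valid.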

_____

Built by Mark Crosbie and Ivan Krsul.

Last Modified: 6 March, 1995.

security-archive@cerias.purdue.edu (COAST Security Archive)