Overview
To keep as much information as possible in the archive, we
download data from other sites and incorporate it into our overall
subject trees. This data is gathered using the mirror
program, which updates the archive so that its files
mirror the files on remote machines scattered all over
the world. The mirror program can be obtained from one of these sites:
How does mirroring work?
The mirror program is located in /u/coast2/ftp-admin/mirror/mirror. It is a
Perl script that is run by the nightly
script. It compares the remote and local directory trees. Files are updated depending on the following conditions:
Why do we need this? Well, otherwise it would be a real pain to tell the mirror program to obtain every file related to a particular tool. Consider a typical tool on a remote ftp server. It probably consists of at least a README file and some archive files, generated by tar and probably compressed using gzip or compress. The author of the tool has probably written it for multiple platforms, so there could be a number of files with very similar names. Finally, the author may have left an older version of the code there too.
So to actually present this tool usefully to our archive users, we must gather all these files. Notice how the most important ones all start with a common name: mytool. Let us assume that this is the name of the actual tool. Well, that would also be a good name for the package we are going to tell the mirror program about. We will write a description telling the mirror program exactly where to locate these files, which ones to get, which ones to ignore, and where to put them locally. We will then give all this information a name. This becomes a package from our point of view. We no longer view these files in isolation - they are part of some larger structure.
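The idea of "which ones to get, which ones to ignore" can be sketched in Python. This is only an illustration: the file names, the two regular expressions, and the select_files helper are hypothetical, and the mirror program may apply its own patterns differently.

```python
import re

# Hypothetical listing of a remote directory that holds one tool.
remote_files = [
    "README",
    "mytool-1.2.tar.gz",
    "mytool-1.2-sunos.tar.Z",
    "mytool-1.1.tar.gz",   # older release left behind by the author
    "othertool.tar.gz",    # unrelated file in the same directory
]


def select_files(names, get_patt, exclude_patt=None):
    """Keep names matching get_patt, then drop any matching exclude_patt."""
    get_re = re.compile(get_patt)
    exclude_re = re.compile(exclude_patt) if exclude_patt else None
    selected = []
    for name in names:
        if not get_re.search(name):
            continue  # not part of the package
        if exclude_re and exclude_re.search(name):
            continue  # explicitly ignored
        selected.append(name)
    return selected


# Get the README and everything named mytool, but ignore the old 1.1 release.
wanted = select_files(remote_files, r"^(README|mytool)", r"mytool-1\.1")
print(wanted)
```

Grouping the files by a shared name prefix is what lets a single package description cover every file that belongs to the tool.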
    package=
    comment =--> <--
    site =
    remote_dir =
    local_dir +
    recurse =
    get_patt =
    exclude_patt =
As an example, here is a sample entry:
    package=legal_bytes
    comment =--> Newsletter of Emerging Legal Issues <--
    site =ftp.eff.org
    remote_dir =/pub/Publications/E-journals/Legal_Bytes
    local_dir +mirrors/ftp.eff.org/Legal_Bytes/
Notice how this sample entry does not use all the fields shown above. If a field is not given, it is filled in with a default value. The package field names the package; every package in the file must have a unique name.
The site field specifies the address of the remote site. This should be stripped of any header information (e.g. ftp://) and trailing pathnames.
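As a rough sketch, a value suitable for the site field could be derived from a full URL like this. The helper name site_from_url and the example URLs are hypothetical; only urlsplit from the Python standard library is used.

```python
from urllib.parse import urlsplit


def site_from_url(url):
    """Return only the host name, dropping any scheme and trailing path."""
    # urlsplit only recognises the host if the string has a scheme or starts
    # with "//", so add "//" when a bare host/path was given.
    if "://" not in url:
        url = "//" + url
    return urlsplit(url).hostname


print(site_from_url("ftp://ftp.eff.org/pub/Publications/E-journals/Legal_Bytes"))
print(site_from_url("ftp.eff.org/pub"))
```

Both calls reduce to the bare host name, which is the form the site field expects; the path part belongs in remote_dir instead.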
The remote_dir field specifies where to locate the files on the remote server. If only a single file or a group of files is to be downloaded, this is the directory containing them. If a directory hierarchy is to be mirrored, this is the root of that directory tree on the remote site.
The local_dir entry specifies where to place the files locally in the archive. In our archive, we have split the data into a subject tree and a mirrors tree. The data obtained by the mirror program is placed in the mirrors tree - the path to the files is obtained by concatenating the default path to the mirrors directory (from the defaults file) with the path specified in the package entry.
For example, if the path to the local directory was /u/coast2/ftp/pub, then the above package would place any files and subdirectories downloaded into /u/coast2/ftp/pub/mirrors/ftp.eff.org/Legal_Bytes/.
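The way a package entry combines with the defaults can be sketched as follows. This assumes, as the description above suggests, that a field written with = replaces the default and a field written with + is appended to it. The DEFAULTS values (including the /u/coast2/ftp/pub/ root) and the parse_entry helper are hypothetical; the real defaults live in the defaults file.

```python
import re

# Hypothetical defaults, standing in for the contents of the defaults file.
DEFAULTS = {
    "comment": "",
    "site": "",
    "remote_dir": "",
    "local_dir": "/u/coast2/ftp/pub/",  # assumed root of the local archive
    "recurse": "true",
    "get_patt": ".",
    "exclude_patt": "",
}

# The sample entry from the text, one field per line.
ENTRY = """\
package=legal_bytes
comment =--> Newsletter of Emerging Legal Issues <--
site =ftp.eff.org
remote_dir =/pub/Publications/E-journals/Legal_Bytes
local_dir +mirrors/ftp.eff.org/Legal_Bytes/
"""

# field name, then "=" or "+", then the value
LINE_RE = re.compile(r"^\s*(\w+)\s*([=+])\s*(.*?)\s*$")


def parse_entry(text, defaults):
    """Return the entry's fields, with unspecified fields taken from defaults."""
    fields = dict(defaults)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank or unrecognised lines
        key, op, value = match.groups()
        if op == "+":
            # Assumed meaning of "+": append to the default value.
            fields[key] = defaults.get(key, "") + value
        else:
            fields[key] = value
    return fields


fields = parse_entry(ENTRY, DEFAULTS)
print(fields["local_dir"])
print(fields["recurse"])
```

Here local_dir ends up as the assumed default root followed by the path from the entry, while recurse, which the entry does not mention, keeps its default.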
How to run the mirror program
The mirror program takes a selection of command line arguments, and a
package file. The arguments control the behaviour of the program. They
override the options specified in the defaults file. The package file is a
file consisting of at least one package entry.
To run the mirror program, simply have the directory
/u/coast2/ftp-admin/mirror in your path. Then execute:
mirror -d ../packages/security-archive.biweekly
The -d option causes the mirror program to report what it is doing on the standard output. This is used to generate the biweekly mail message giving the output of the mirror program.
In this case, the mirror program processes every package in the package file. To mirror a specific package, you must specify that package by name on the command line using the -p option. The name of the package is whatever it was called in the packages file. This gives a convenient way of telling the mirror program to mirror just one specific package rather than the whole packages file. If you don't want to actually do anything, but just want to see what would have happened, use the -n option. As an example:
mirror -n -d ../packages/security-archive.biweekly
With the -p option, the mirror program takes the name of a package and updates the archive to reflect any changes that may have taken place in that package on the remote site:
mirror -d -p<package1> ../packages/security-archive.biweekly
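For scripting these invocations, a small helper can assemble the command line from the options described above. This is a sketch only: build_mirror_command is a hypothetical name, the package name legal_bytes is taken from the sample entry, and the command is only printed, not run. It uses just the -n, -d and -p options mentioned in this document.

```python
import shlex


def build_mirror_command(package_file, package=None, dry_run=False, verbose=True):
    """Build the argument list for one run of the mirror program."""
    argv = ["mirror"]
    if dry_run:
        argv.append("-n")            # show what would happen, change nothing
    if verbose:
        argv.append("-d")            # report progress on standard output
    if package is not None:
        argv.append("-p" + package)  # restrict the run to one named package
    argv.append(package_file)
    return argv


cmd = build_mirror_command(
    "../packages/security-archive.biweekly",
    package="legal_bytes",
    dry_run=True,
)
print(shlex.join(cmd))
```

Leaving dry_run set until the printed command looks right is a cheap way to check a new package entry before it touches the archive.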
Built by Mark Crosbie and
Ivan Krsul.