4. Configuring resources
For our purposes, an "application" is any scientific code that takes a text-based parameter file and perhaps one or more data files as input, runs noninteractively, and produces a set of output files, including a text log file and checkpoint files which can be used to restart the code. In order to let Teuthis know how to work with an application, you must configure it using the applications dialog, which is accessed through the Settings menu in the main window. An example application dialog is shown in Figure 1.
Figure 1. Editing application profiles.
Configured applications are listed in the
scrolled window on the left. Selecting one of them fills the
fields on the right with the configuration data for that
application. New applications can be added to the list by
clicking on the "New" button at the bottom of the dialog.
Applications can be cloned by clicking on the "Clone" button; this will
create a copy of the application with "Copy" appended to the
application name. Applications can also be removed from the list
by clicking the "Remove" button. Action buttons like "New,"
"Clone," and "Remove" take effect immediately and are not cancelled if
the dialog is exited via the "Cancel" button.
The application name is used to populate the
application pulldown list in the experiment dialog, so it is best to
keep it short. It is also used to name the build directory on the
execution machine if the "Build executable" box is checked. The
name may include spaces; these will be "munged" into underscores when
naming directories. (The same happens when using the name of a
job to create a unique run directory.) Two distinct applications
may have the same name, but this will be confusing and may have
unpredictable results.
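As a rough illustration of the name munging described above, the following Python sketch replaces spaces with underscores. The function name munge_name is hypothetical and is not part of Teuthis. Only the space-to-underscore rule comes from the text; how Teuthis treats other characters is not specified here, so the sketch leaves them alone.

```python
def munge_name(name: str) -> str:
    """Return a directory-friendly form of a name: spaces become underscores.

    Hypothetical helper. Only the space rule is documented, so every other
    character is passed through unchanged.
    """
    return name.replace(" ", "_")


if __name__ == "__main__":
    # An application called "FLASH 2.x" would get a build directory
    # named after the munged form.
    print(munge_name("FLASH 2.x"))
```

Because the munged form is what appears on the execution machine, two applications whose names differ only in spaces versus underscores would map to the same directory name under this rule.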
The description field is purely informative and
is not used by Teuthis at this time.
If your application needs to be built from source
code, Teuthis can manage the configuration and building of the
application's executable file, or else you can log in to each of your
remote execution machines and build it by hand. If Teuthis is to
manage the build, you must check the "Build executable" box. You
will need to select a path on your local machine where the source code
may be found. Ideally this will be a directory into which you
have checked out a copy of the code from a source code repository such
as CVS or Subversion; however, this is not a requirement. You
will also need to specify the name of the configuration command (if
any) and the name of the file containing configuration information (if
any). The configuration command should be as generic as possible
(e.g., ./configure --parallel); you will have the
opportunity to add arguments within the experiment dialog before
issuing the command for each experiment. For some codes no
configuration is necessary; instead, you might expect to upload a local
makefile named "Makefile.myproject," rename it "Makefile," and then
build the code. In such cases you should leave the "Configure
command" field blank and fill in the "Configuration file" field with
the name the makefile should be given on the remote machine. Then
use the configuration file selection and upload function of the
experiment dialog to choose and upload your makefile.
If Teuthis is managing the build, you should also
fill in the "Build command" field. For most codes this will be make
or gmake; others may use tools such as ant or cmake. By checking the
box labeled "Move to
remote exec dir" you can have the executable moved to the executable
directory you have configured for the machine on which the build is to
take place. If you check this box, jobs using this application
will expect to run the executable from the executable directory.
Teuthis will expect the build process to leave the executable in the
build directory for "pickup." If you do not check the box, jobs
will attempt to run the executable straight from the build
directory. In either case the "Executable name" field should
contain just the name of the file that is produced by the build process.
If Teuthis is not managing the build, you should
leave the "Build executable" and "Move to remote exec dir" boxes
unchecked. The "Executable name" field should contain either the
name of the executable if it is in your default path or the absolute
path to the executable if it is not.
If your code is a parallel code, check the
"Parallel executable" box. Job scripts using this application
will prefix your executable's name with the parallel execution command
configured for the execution machine.
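The rules above for locating the executable and prefixing the parallel execution command can be summarized in a short Python sketch. This is an illustration of the behavior described in the text, not Teuthis's actual implementation: the AppProfile class, the job_command function, and the example paths and mpirun command are all assumptions made for the example.

```python
from dataclasses import dataclass
import posixpath


@dataclass
class AppProfile:
    # Hypothetical stand-in for the check boxes and fields of the
    # applications dialog.
    executable_name: str
    build_executable: bool = False
    move_to_exec_dir: bool = False
    parallel: bool = False


def job_command(app, build_dir, exec_dir, parallel_command, args=""):
    """Assemble the command line a job script might use to start the code."""
    if not app.build_executable:
        # Not built by Teuthis: the name is either on the default path
        # or already an absolute path, so it is used as given.
        executable = app.executable_name
    elif app.move_to_exec_dir:
        # Built by Teuthis and moved to the machine's executable directory.
        executable = posixpath.join(exec_dir, app.executable_name)
    else:
        # Built by Teuthis and run straight from the build directory.
        executable = posixpath.join(build_dir, app.executable_name)

    parts = [executable]
    if args:
        parts.append(args)
    if app.parallel:
        # Parallel codes are prefixed with the machine's parallel command.
        parts.insert(0, parallel_command)
    return " ".join(parts)


if __name__ == "__main__":
    flash = AppProfile("flash2", build_executable=True,
                       move_to_exec_dir=True, parallel=True)
    print(job_command(flash, "/home/user/build/FLASH_2.x",
                      "/home/user/bin", "mpirun -np 8"))
```

The three branches correspond to the three cases in the text: a hand-built executable, a managed build moved to the executable directory, and a managed build run in place.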
Parameter files and restarting
For each job, Teuthis will create a parameter
file in the job's run directory with the name given in the "Parameter
file" field. Your code should expect to find this file in the
directory from which it is run, either by default or through the use of
command-line options (which can be specified in the experiment
dialog). (As with the configuration file, this need not be the
local name you use for your parameter file templates.) Teuthis
will expect your code to generate a log file with the name given in the
"Log file" field. This file will be downloaded and stored within
your project file when you click on the "View log file" button or menu
item associated with jobs using this application.
Teuthis can automatically create restart jobs for
you with a little help from your application. Three methods are
supported at present. By default, Teuthis will expect a control
file with a specific name to exist in the run directory when you ask to
continue a job. If you choose this option and leave the control
file entry blank, Teuthis will not look for any specific file and will
simply create a job that re-executes the original job. It will be
up to your application to detect that it is running a restart
job. The second option is to specify a special command-line
argument (e.g., "--restart-from checkpoint_file").
The third option is for Teuthis to expect your code to write a special
version of the runtime parameter file after each successful checkpoint. The
name of the file is specified in the space supplied. It should
contain all of the original runtime parameter settings for the run plus
any settings needed to cause the application to restart from the
checkpoint just written. When asked to continue a job, Teuthis
will download this file and use it for the runtime parameter settings
of the restart job.
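For the third method, the application itself has to write the restart version of the parameter file. The Python sketch below shows one way an application might do this after a checkpoint. It is only an illustration: the "key = value" file format, the function name, and the parameter names restart and checkpoint_number are assumptions, and your code's own parameter syntax should be used instead.

```python
def write_restart_parameters(original, restart_settings, path):
    """Write a restart parameter file: original settings plus overrides.

    `original` holds the settings the run was started with, and
    `restart_settings` holds whatever the code needs in order to resume
    from the checkpoint just written. Both are hypothetical dictionaries.
    """
    merged = dict(original)
    merged.update(restart_settings)  # restart values override originals
    with open(path, "w") as handle:
        for key, value in merged.items():
            handle.write(f"{key} = {value}\n")
    return merged


if __name__ == "__main__":
    settings = {"nsteps": 100, "restart": "false"}
    # Called by the application after checkpoint 7 has been written.
    write_restart_parameters(settings,
                             {"restart": "true", "checkpoint_number": 7},
                             "example.par.restart")
```

Writing the file only after the checkpoint has completed matters here, since Teuthis downloads whatever version of the file exists when you ask to continue the job.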
Settings for common applications
In the table below we list some of the
applications that have been used with Teuthis and the suggested
settings for them. If your application isn't listed and you find
some settings that work, please let us know and we
will add your application to the list. If you are reading this
document as part of your Teuthis distribution, please see the online
user's guide for the latest version of this table. Note that
these suggestions only cover the basic operation of each code; codes
may have additional capabilities that are enabled by configuration or
make arguments. Please see your application's documentation for
any special features.
Application | Exec name and args | Config file | Config command | Build command | Move executable | Parallel | Parameter file | Log file | Auto restart method | Notes |
---|---|---|---|---|---|---|---|---|---|---|
FLASH 2.x | flash2 | Modules | ./setup | gmake | optional | yes | flash.par | flash.log | Copy from flash.par.restart; or command line argument "-chk_file" followed by manual addition of checkpoint file name | flash.par.restart not available in standard distribution; need patch. Need to set up site directory for remote site. Leave log_file parameter unset. |
Gadget 2 | Gadget2 gadget.param | Makefile | N/A | gmake | optional | yes | gadget.param | info.txt | Command line argument "1" | Upload custom makefile as your configuration file. Use "." for OutputDir parameter. Leave InfoFile parameter unset. |
Enzo 1.0.1 | enzo.exe EnzoParms | N/A | ./configure --bindir=XX | cd amr_mpi/src; gmake mach-YY; gmake; gmake install | yes | yes | EnzoParms | OutputLevelInformation | Command line argument "-r"; must manually add name of last restart file | XX = absolute path to build directory. Need to set up Make.mach.YY file for remote machine; place in config directory. |
Hydra 4.2 | hydra | makeflags | N/A | make clean; make | yes | no | prun.dat | pr0001.log | None; manually edit prun.dat | May need to create a new src/system.YY file for remote machine YY. Modify src/dumpdata.F, src/readdata.F, and src/gravsubs.F to read/write to ./ rather than data/. To change array sizes, edit include/psize.inc on local machine and sync source. Upload custom makeflags file as your configuration file; set RUNDIR to "..". Use 0001 as run number in prun.dat. |

At present, Teuthis does not support dynamic
resource discovery over the Grid. Hence you, the user, must
create profiles for each of the machines you expect to use with
Teuthis. This is done through the machines dialog, which is
accessed through the Settings menu in the main window. An example
is shown in Figure 2.
Patterns understood by Teuthis in remote command fields

Pattern | Meaning | Pattern | Meaning | Pattern | Meaning |
---|---|---|---|---|---|
%A | Account name | %a | Application arguments | %b | Remote batch file name |
%C | Remote change directory command | %c | Remote command | %D | Remote path create command |
%d | Remote run path | %e | Path to remote executable | %f | "From" file in file transfers |
%H | Wall clock time limit hours | %h | Remote host; or "from" host in third-party transfers | %i | "To" host in third-party transfers |
%j | Remote job identifier | %K | Local certificate file name | %k | Local key file name |
%L | Remote job name | %l | Proxy lifetime in hours | %M | Wall clock time limit minutes |
%m | Memory per node (MB) | %N | Number of nodes | %n | Number of CPUs |
%P | Remote parallel execution command | %p | Remote password | %Q | Queue name |
%R | Kerberos realm | %r | "To" path in file transfers | %S | Wall clock time limit seconds |
%T | CPU tiling (CPUs/node) | %t | Wall clock time limit (%H:%M) | %u | Remote userid; or "from" userid in third-party transfers |
%v | "To" userid in third-party transfers | %z | Remote directory for Teuthis files | | |

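
To make the pattern mechanism concrete, here is a minimal Python sketch of how single-letter % patterns such as %u, %h, and %d could be expanded in a command template. It is not Teuthis's own code: the function name, the example template, and the example values are hypothetical, and the choice to leave unknown patterns untouched is an assumption, since the text does not say how Teuthis treats them.

```python
import re

# One percent sign followed by a single letter, e.g. %u or %H.
PATTERN = re.compile(r"%([A-Za-z])")


def expand_patterns(template, values):
    """Replace each %X in `template` with values["X"], if present.

    Patterns with no entry in `values` are left as they are (an
    assumption of this sketch). Substituted text is not rescanned, so a
    value that happens to contain "%" is inserted literally.
    """
    def substitute(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return PATTERN.sub(substitute, template)


if __name__ == "__main__":
    # Hypothetical remote command template and values.
    template = "ssh %u@%h '%C %d; %c'"
    values = {
        "u": "alice",
        "h": "cluster.example.org",
        "C": "cd",
        "d": "/scratch/alice/run1",
        "c": "qsub job.sh",
    }
    print(expand_patterns(template, values))
```

Note that the patterns are case sensitive: %h (remote host) and %H (wall clock hours) are different entries in the table, which is why the sketch looks values up by the exact letter.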
Use ssh to execute remote commands and scp to transfer files.
Authentication is via password. You will have the option upon
entering your password to save the password in Teuthis during the
session in which you enter it (passwords are never written to project
files). While this is not very secure, without it you will be
typing your password many, many times. A better choice is to use
one of the authentication methods listed below. Third-party file
transfers are not supported with ssh. To use this or any of the
other ssh* options, you must have ssh installed on your computer.

Use ssh with RSA authentication. To use this, you must generally create a
public key-private key pair on the machine on which you run
Teuthis. The public key is then copied into the .ssh/authorized_keys
file on machines to which you would like to connect without a
password. Your keys should be generated without a
passphrase. This option is better than plain ssh, but not as good
as the ssh-agent method. On your local machine, the keys are
protected only by your filesystem's access permissions.

Use ssh with RSA authentication, but encrypt your private key using a
passphrase. A program called ssh-agent manages your
keys on your local machine. You use the ssh-add
command to add private keys to ssh-agent's store; then
when you access the remote machine, ssh-agent tries each
of the keys it is managing. The remote machine should be
configured with your public key as with the ssh-rsa method. This
is the most secure method to use with ordinary ssh.

gsiscp is used for file transfers,
and third-party file transfers are supported. This is the easiest
Grid method to set up, but it also offers the poorest file transfer
performance.

Use gsissh for remote command execution and globus-url-copy for file
transfers. Third-party file transfers are supported.

Use gsissh for remote command execution and uberftp for third-party
file transfers. (gsiscp is used for uploads and
downloads because it can preserve file permissions.) You must
install the uberftp package for this option to work.

Use rsh for remote command execution and rcp
for file transfers. You must enter the name of the Kerberos realm
for this machine in the "Realm" field. The versions of rsh
and rcp on your system must support Kerberos
authentication. Third-party file transfers are not supported at
present. Kerberos ticket checking is also not supported at
present, so you will periodically need to shut down and restart Teuthis
to use this method.

/run_root/project_name/experiment_name/run_name/job_name
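
The path template /run_root/project_name/experiment_name/run_name/job_name can be read as a run root directory followed by one directory level each for the project, experiment, run, and job. The Python sketch below assembles such a path. It is an illustration under stated assumptions: the function name and example values are hypothetical, run_root is taken to be a directory configured for the machine, and spaces are replaced with underscores in every component, although the text confirms this only for job names.

```python
import posixpath


def run_directory(run_root, project, experiment, run, job):
    """Build run_root/project/experiment/run/job for a remote machine.

    posixpath is used so that forward slashes are produced regardless
    of the operating system Teuthis itself runs on.
    """
    # Assumption: spaces become underscores in every component.
    components = [part.replace(" ", "_")
                  for part in (project, experiment, run, job)]
    return posixpath.join(run_root, *components)


if __name__ == "__main__":
    print(run_directory("/scratch/alice", "Cluster Sims",
                        "exp1", "run A", "job 1"))
```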