In an effort to balance the use of the hal cluster:

- Each user is assigned a login node, hal#, and is expected to log into that node; users may, however, log into any node in their pool.
- Each user is assigned disk space /hal#.tmp/username, where # is the user's assigned node. Users may, however, use disk space on any node in their pool.
- To submit batch jobs, the qsub command should be used.
- Users may also use the /hal#.tmp2 disks in their pool, but note that these disks are scrubbed once per month and ALL FILES OLDER THAN ONE MONTH ARE DELETED.
- Users who need an exception for files on /hal#.tmp2 may request so.
| user | login node | disk space | primary pool | secondary pool(s) |
| jonjon (Andy Majda) | hal01 | /hal01.tmp/majda | A | all nodes but hal02, hal03 |
| crommelin (Daan Crommelin) | hal01 | /hal01.tmp/crommelin | A | " |
| holland (David Holland) | hal02 | /hal02.tmp/holland | B | " |
| cshaji (C. Shaji) | hal03 | /hal03.tmp/cshaji | B | " |
| epab (Povl Abrahamsen) | hal03 | /hal03.tmp/epab | B | " |
| buhler (Oliver Buhler) | hal04 | /hal04.tmp/buhler | C | " |
| reddy (Tasha Reddy) | hal04 | /hal04.tmp/reddy | C | " |
| tabak (Esteban Tabak) | hal05 | /hal05.tmp/tabak | D | " |
| tea (Todd Arbetter) | hal05 | /hal05.tmp/tea | D | " |
| grote (Marcus Grote) | hal05 | /hal05.tmp/grote | D | " |
| kleeman (Richard Kleeman) | hal06 | /hal06.tmp/kleeman | E | " |
| ytang (Youmin Tang) | hal07 | /hal07.tmp/ytang | E | " |
| pauluis (Olivier Pauluis) | hal07 | /hal07.tmp/pauluis | E | " |
| shafer (Shafer Smith) | hal08 | /hal08.tmp/shafer | F | " |
| xuemin (Xuemin Tu) | hal08 | /hal08.tmp/xuemin | F | " |
| givelbrg (Ed Givelberg) | hal08 | /hal08.tmp/givelbrg | F | " |
| desteur (Laura de Steur) | hal09 | /hal09.tmp/desteur | B | " |
| abramov (Rafail Abramov) | hal10 | /hal10.tmp/abramov | A | " |
| franzke (Christian Franzke) | hal10 | /hal10.tmp/franzke | A | " |
| kleeman (Richard Kleeman) | hal11 | /hal11.tmp/kleeman | E | " |
| jenkins (Adrian Jenkins) | hal12 | /hal12.tmp/jenkins | B | " |
| thoma (Malte Thoma) | hal13 | /hal13.tmp/thoma | B | " |
| smedsrud (Lars Smedsrud) | hal13 | /hal13.tmp/smedsrud | B | " |
| kleeman (Richard Kleeman) | hal14 | /hal14.tmp/kleeman | E | " |
| barreiro (Andrea Barreiro) | hal15 | /hal15.tmp/barreiro | G | " |
| dgoldberg (Dan Goldberg) | hal15 | /hal15.tmp/dgoldberg | G | " |
| konigc (Chris Konig) | hal15 | /hal15.tmp/konigc | G | " |
| tulloch (Ross Tulloch) | hal15 | /hal15.tmp/tulloch | G | " |
| walkerr (Ryan Walker) | hal16 | /hal16.tmp/walkerr | G | " |
| schaffri (Helga Schaffrin) | hal16 | /hal16.tmp/schaffri | G | " |
| saverio (Saverio Spagnolie) | hal16 | /hal16.tmp/saverio | G | " |
No files on hal are currently backed up! It is the user's responsibility to have all critical files copied to some other system where backups are performed (e.g., math1.cims.nyu.edu).
To run "q" commands (qsub, etc.), those users whose GID (group ID) is not already defined in the YP (Yellow Pages) groups will need to run the newgrp command in a window on one of the hal nodes before running one of the "q" commands. The user should then be able to run "q" commands in the window from which that command was issued (or in any child process windows).
Users whose GID is already defined in the YP groups will not need to run the newgrp command. This is true of any user whose GID is 1000, 4000, 5000, or 6000. Many users, however, have been assigned a GID equal to their UID; they will have to run the newgrp command. The best way to find out is simply to try.
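As a quick check before trying (a sketch; the group name passed to newgrp is a placeholder for your actual assigned group):

```shell
# Print your numeric group ID.  If it is not 1000, 4000, 5000, or 6000,
# you will likely need to run "newgrp <your_group>" (placeholder name)
# in this window before using the "q" commands.
id -g
```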
To submit a job to a queue, create a shell script with the commands you wish to have executed. On the hal system, you should include directives such as the following at the very beginning of your shell script:
#-----------------------------------------------------------------------
# switches: SUN queue directives
# (leader characters are '#$ ')
#-----------------------------------------------------------------------
#
# set shell as bourne, csh, etc
#$ -S /bin/your_favorite_shell
# define user queue
#$ -q your_hal_node.q
# execute in current working directory
#$ -cwd
# export environmental variables
#$ -V
# job name
#$ -N your_job_name
# send standard output to specific file
#$ -o $JOB_NAME.$JOB_ID
# merge error output with standard output
#$ -j y
# define list of users for email notification
#$ -M email@example.com
# send email on job end and suspension
#$ -m es
#
#-----------------------------------------------------------------------
To execute your commands, you submit your job script via the qsub command. More details are found in the man pages for qsub.
As an example of submitting a job to the appropriate primary and secondary queues, user holland would submit: qsub -q hal02.q,hal03.q,hal09.q,hal12.q,hal13.q,hal01s.q,hal04s.q,hal05s.q,hal06s.q,hal07s.q,hal08s.q,hal10s.q,hal11s.q,hal14s.q,hal15s.q,hal16s.q (or some subset thereof).
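Putting the directives together, a minimal job script for user holland might look like the following sketch (the job name and the program ./a.out are placeholders; the queue names follow the table above):

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -q hal02.q,hal03.q,hal09.q
#$ -cwd
#$ -V
#$ -N holland_run
#$ -o $JOB_NAME.$JOB_ID
#$ -j y
# program to execute (placeholder)
./a.out
```

If this script were saved as holland_run.sh, it would be submitted with: qsub holland_run.sh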
To check the status of your job script, issue the qstat command. Again, more details are found in the appropriate man pages. The qstat command informs you of the various jobs running on the hal system. Currently, the queues available are identified as hal#.q. If you do not explicitly specify your queue, the system picks one for you from your group (A or B).
To remove an undesired job from the system queue, issue the
qdel command. More details are found in the appropriate man pages.
Waiting for the compilation of all the individual program fragments of a large program can be an onerous task. The dmake command can distribute a makefile's build over any number of the hal nodes, thus markedly speeding up the compilation process. See the man pages for dmake for further information. To enable dmake on hal, follow these steps:
1. Create a .dmakerc file in your home directory that contains the names of the nodes on which you want your dmake jobs to run. Here is an example .dmakerc for the hal cluster.
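A sketch of such a .dmakerc, following the Sun dmake runtime-configuration syntax (the group name, the node list, and the jobs-per-host counts are illustrative; list whichever nodes are in your pool):

```
# distribute make jobs over these hal nodes, up to 2 jobs per host
group "hal" {
  host hal01 { jobs = 2 }
  host hal02 { jobs = 2 }
  host hal03 { jobs = 2 }
}
```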
2. Enable rsh login access to all the hal nodes. You will need to have a .rhosts file. In the example provided, you need to substitute your actual user name for your_usr_name. Additionally, the space between the platform name and the user name must be either a single blank space or a single tab.
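A minimal .rhosts sketch (one line per node, with your_usr_name standing in for your actual user name; list every hal node you intend to use):

```
hal01.cims.nyu.edu your_usr_name
hal02.cims.nyu.edu your_usr_name
hal03.cims.nyu.edu your_usr_name
```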
3. Verify rsh by issuing a test command, for example: rsh hal01.cims.nyu.edu date
SunOS includes a debugger with a GUI interface. It is invoked by issuing the command prism a.out. It assumes that, for FORTRAN, you have built your executable (i.e., a.out) using parallel compilation (even if you only intend to run your code serially). To compile any file of your code in parallel mode, issue the command mpf95 -c your_file.f. You must then link all your object files into a single executable using the command mpf95 -o a.out *.o.
Some popular graphics packages available on the hal cluster are:

- Ferret, under /usr/local/pkg/ferret on all nodes.
- GrADS, under /usr/local/pkg/GrADS on all nodes.
- Packages under /usr/local/bin on all nodes.
The (64-bit) netCDF library libnetcdf.a is available under the path /usr/local/pkg/netcdf-64/lib. The Fortran-90 module netcdf.mod is available under the path /usr/local/pkg/netcdf-64/src/f90. Details on using netCDF in a Fortran-90 programming environment are available here.
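As a sketch of compiling and linking a Fortran-90 program against this netCDF installation (your_prog.f90 is a placeholder; the -M module-search flag follows the Sun f95 convention):

```shell
# compile, telling f95 where to find netcdf.mod
mpf95 -c -M/usr/local/pkg/netcdf-64/src/f90 your_prog.f90
# link against the 64-bit netCDF library
mpf95 -o a.out your_prog.o -L/usr/local/pkg/netcdf-64/lib -lnetcdf
```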
The most up-to-date version of netCDF can be downloaded here. For the hal system, the user must set the following environment variables prior to building:
setenv CFLAGS "-xarch=v9 -xtarget=ultra3"
setenv CXXFLAGS "-xarch=v9 -xtarget=ultra3"
setenv FFLAGS "-xarch=v9 -xtarget=ultra3"
setenv F90FLAGS "-xarch=v9 -xtarget=ultra3"
The ncview command is available under the path /usr/local/bin. This command is suitable for taking a quick look at the contents of a netCDF file. Further details are available here.
Some parallelization techniques available on the hal cluster are:
© David Holland.
All Rights Reserved.
If you would like further information concerning any of the above topics, please send email.