PDSF - General
From Computing@RNC
What is PDSF?
PDSF stands for Parallel Distributed Systems Facility. It is a computer farm where one can process a large number of similar computing tasks in parallel. It is located at NERSC in Oakland, CA. For more information, please see http://www.nersc.gov/nusers/systems/PDSF/.
How do I contact the PDSF admin team in case of a problem?
You can always send email to consult@nersc.gov, or submit a ticket through the web interface:
Then follow "Ask nersc Consultants".
My password no longer works at PDSF
- I mistyped the password three (or more) times
For security reasons, NERSC will *lock out* an account that has three or more failed login attempts. The lock out will last *12 hours*. If this happens to you, contact NERSC account support (1-800-666-3772, option #2) to have it reset.
- It's been a while, and I can't remember my password
Please call NERSC account support (1-800-666-3772, option #2) to have your password reset. Note that if you have not logged in for a long time (~6 months), your account may be deactivated. If this is the case, NERSC account support will ask that you re-submit your signed NERSC User Agreement.
How to create & access individual web content on PDSF
- Current & Future Model
* static content is put under the group-writable area: /project/projectdirs/star/www/
* it is accessed via http://portal.nersc.gov/project/star/
* please add your own user area subdirectory, e.g. http://portal.nersc.gov/project/star/username
* there will not be a system-wide migration; each user should migrate their own web area
* static html only (i.e. dynamic content must be pre-generated)
- Old Model (Deprecated)
* static content is put into $HOME/public_html
* it is accessible via http://pdsfweb01.nersc.gov/~username
How to use IO resources of networked file systems (*eliza*)
The networked file systems on PDSF are visible from both interactive (pdsf.nersc.gov) and batch nodes. Batch processes should always specify an IO resource in the job description. The star scheduler handles this more or less automatically. For explicit job submission, use:
qsub -hard -l elizaXXio=1 [script]
Here -l elizaXXio=1 names the network IO resource accessed by the job (XX is the number of the eliza file system) and assigns a resource limit of 1. Failing to supply resource limits explicitly can cause your jobs to take a larger fraction of an IO resource, degrading its overall performance to the detriment of everyone.
Users who abuse the limits will have their use of the system limited more directly. If you over-specify your resource needs, your job will likely not run. For example, if you always submit with, say, -l eliza1io=1,eliza8io=1,eliza9io=1 because you've used those resources in the past, you will find that your jobs can wait in the queue for a long time until all of those resources become free at the same time.
For more information about setting io resource usage, please see this PDSF FAQ entry.
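As a rough illustration of requesting only the IO resources a job actually touches, the sketch below assembles the qsub command line from a list of eliza numbers. The function name io_qsub_cmd and the script name myjob.sh are hypothetical; the function only prints the command, so it can be inspected before it is run.

```shell
# Print a qsub command that requests one unit of each listed eliza IO resource.
# Usage: io_qsub_cmd SCRIPT ELIZA_NUMBER [ELIZA_NUMBER ...]
# (io_qsub_cmd is a hypothetical helper, not a PDSF tool.)
io_qsub_cmd() {
    iq_script=$1
    shift
    iq_res=""
    for iq_n in "$@"; do
        # Join the requests with commas: eliza8io=1,eliza9io=1,...
        iq_res="${iq_res:+$iq_res,}eliza${iq_n}io=1"
    done
    if [ -z "$iq_res" ]; then
        echo "io_qsub_cmd: name at least one eliza file system" >&2
        return 1
    fi
    echo "qsub -hard -l $iq_res $iq_script"
}
```

For example, io_qsub_cmd myjob.sh 8 prints the command for a job that only reads or writes eliza8. Listing only the file systems the job really uses keeps it from waiting on resources it does not need.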
HPSS Information
For a good overview of HPSS, see the NERSC-HPSS pages at http://www.nersc.gov/nusers/systems/HPSS/. Some additional documentation of the HSI command line interface can be found from Gleicher Enterprises.
Some examples
Querying total filesize of a file type in a directory
pdsf1 56% hsi -P "du /nersc/projects/starofl/embedding/production_dAu/Piminus_207_1233091609/P08ie.SL08f/2008/024/*event.root"
-----------------------
41099121 total 512-byte blocks, 74 Files (21,042,749,845 bytes)
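To use the total in a script, the byte count can be pulled out of the summary. This is a minimal sketch, assuming the summary has the "(N bytes)" form shown in the example; the function name hsi_du_bytes is hypothetical, and it reads the hsi output on standard input.

```shell
# Extract the byte total from an hsi "du" summary and strip the thousands
# separators, e.g. "(21,042,749,845 bytes)" becomes 21042749845.
# (hsi_du_bytes is a hypothetical helper, not part of hsi.)
hsi_du_bytes() {
    sed -n 's/.*(\([0-9,][0-9,]*\) bytes).*/\1/p' | tr -d ','
}
```

It could be used as hsi -P "du [path]" | hsi_du_bytes. If hsi writes the listing to standard error in your setup, adding 2>&1 before the pipe may be needed.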
SGE Questions
For a good overview, please see the PDSF SGE page at http://www.nersc.gov/nusers/systems/PDSF/software/SGE.php. There are several tools available to monitor your jobs. The qmon command is a graphical interface to SGE which can be quite useful if your network connection is good. The command-line tools 'sgeuser' and 'qstat' show overall farm status and your individual job listings, respectively; both are discussed in the overview page linked above.
How can I tell if an SGE consumable resource is used up?
Consumable resources in SGE on PDSF (known as complexes) are configured globally but are requested by host. Thus, to determine whether a resource is available, you can run the SGE qhost command against any host known to the system. SGE will report the value of that resource, which is common to all hosts.
qhost -F eliza8io,eliza9io,eliza13io -h pc1008
will produce output showing the availability of these IO resources. If a value is 0.0000, that resource is fully used. If a value is greater than 0, new jobs can access that resource.
Checking all resources with supplied script
We put together a simple script to check the resources explicitly. It is installed on PDSF:
/usr/common/nsg/bin/getIOlimits.pl
and is run without arguments and produces:
pdsfgrid1 105% getIOlimits.pl
SGE GlobalResource: Total eliza7io = 200 Available=200 Used= 0.0 %
SGE GlobalResource: Total hpssio = 200 Available=199 Used= 0.5 %
SGE GlobalResource: Total eliza4io = 200 Available=57 Used= 71.5 %
SGE GlobalResource: Total scpidl = 25 Available=25 Used= 0.0 %
SGE GlobalResource: Total eliza12io = 150 Available=150 Used= 0.0 %
SGE GlobalResource: Total eliza11io = 150 Available=150 Used= 0.0 %
SGE GlobalResource: Total projectio = 500 Available=500 Used= 0.0 %
SGE GlobalResource: Total eliza13io = 150 Available=0 Used= 100.0 %
SGE GlobalResource: Total eliza3io = 150 Available=56 Used= 62.7 %
SGE GlobalResource: Total eliza1io = 0 Available=0 Used= 100.0 %
SGE GlobalResource: Total eliza10io = 100 Available=100 Used= 0.0 %
SGE GlobalResource: Total eliza8io = 175 Available=175 Used= 0.0 %
SGE GlobalResource: Total snapidl = 768 Available=768 Used= 0.0 %
SGE GlobalResource: Total eliza9io = 200 Available=0 Used= 100.0 %
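To pick out only the resources that are currently exhausted, the output can be filtered. This is a minimal sketch, assuming each resource is reported on its own line with the "Total name = N Available=M" fields shown in the example; the function name exhausted_io is hypothetical.

```shell
# Read getIOlimits.pl output on standard input and print the name of every
# resource whose Available count is 0.
# (exhausted_io is a hypothetical helper; the field layout is assumed from
# the sample output.)
exhausted_io() {
    awk '/SGE GlobalResource:/ {
        name = ""; avail = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "Total") name = $(i + 1)
            if ($i ~ /^Available=/) {
                # "Available=" is 10 characters; the value follows it,
                # or sits in the next field if a space was printed.
                avail = substr($i, 11)
                if (avail == "") avail = $(i + 1)
            }
        }
        if (name != "" && avail != "" && avail + 0 == 0) print name
    }'
}
```

It could be run as getIOlimits.pl | exhausted_io. With the sample output it would list eliza13io, eliza1io and eliza9io.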
Batch jobs: Local scratch space ($SCRATCH)
Each node has local disk storage associated with it, available through $SCRATCH. It is recommended that users read and write to the scratch area while their job is running, then copy their output files to the final destination (either HPSS or GPFS disk).
SGE, the batch queue system, maintains a unique disk area for each job as scratch. The environment variable $SCRATCH is mapped to this area for each individual job. This means that users do not have to worry about their jobs running on different cores of one node interfering with each other.
It's important to remember that SGE removes this directory as soon as the job is complete. If you want to keep any output files, your job will need to archive those files before exiting.
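As a rough sketch of copying results out before the job ends, the helper below copies named files from $SCRATCH to a destination directory on disk. The function name stage_out and the destination path are hypothetical; archiving to HPSS would use hsi instead of cp.

```shell
# Copy the named output files from the job's $SCRATCH area to a destination
# directory, failing if a file is missing or a copy fails.
# Usage: stage_out DEST_DIR FILE [FILE ...]
# (stage_out is a hypothetical helper, not a PDSF or SGE tool.)
stage_out() {
    so_dest=$1
    shift
    mkdir -p "$so_dest" || return 1
    for so_f in "$@"; do
        if [ ! -f "$SCRATCH/$so_f" ]; then
            echo "stage_out: $SCRATCH/$so_f not found" >&2
            return 1
        fi
        cp "$SCRATCH/$so_f" "$so_dest/" || return 1
    done
}
```

A job script would call it as its last step, for example stage_out /path/to/final/area myoutput.root, and could exit with a non-zero status if it fails so that the loss is visible in the job accounting.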
Batch jobs: I get an error when I try to create a directory under /scratch. What do I do?
The local scratch area is now managed by SGE, and users *cannot* create and maintain their own directories on /scratch. The disk area you can write to is pointed to by the env variable $SCRATCH or $TMPDIR. Please use these instead of /scratch/$username:
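#!/bin/sh
# The input MuDst file is passed as the first argument
mudstfile=$1
# Work in the job-private scratch area provided by SGE
cd $SCRATCH
pwd
root4star -q ~/analysis/macros/myAnalysis.C $mudstfile
mv myoutput.root $mudstfile.analysis.root
# Archive the result to HPSS before SGE removes $SCRATCH
hsi "cd analysis; prompt; mput $mudstfile.analysis.root"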
Running this script as a batch job produces output like:
/scratch/1135296.1.starprod.64bit.q
Warning in <TEnvRec::ChangeValue>: duplicate entry <Library.TMCParticle=libEGPythia6.so libEG.so libGraf.so libVMC.so> for level 0; ignored
*******************************************
*                                         *
*        W E L C O M E  to  R O O T       *
*                                         *
*   Version   5.12/00f    23 October 2006 *
*                                         *
*  You are welcome to visit our Web site  *
*          http://root.cern.ch            *
*                                         *
*******************************************
FreeType Engine v2.1.9 used to render TrueType fonts.
Compiled on 23 July 2008 for linux with thread support.
CINT/ROOT C/C++ Interpreter version 5.16.13, June 8, 2006
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
*** Float Point Exception is OFF ***
*** Start at Date : Thu Oct 15 11:08:59 2009
QAInfo:You are using STAR_LEVEL : new, ROOT_LEVEL : 5.12.00 and node : pdsf3
[clip]
***********************************************************************
*              NERSC HPSS User SYSTEM (archive.nersc.gov)             *
***********************************************************************
Username: aarose UID: 34500 Acct: 34500(34500) Copies: 1 Firewall: off [hsi.3.4.3 Thu Jan 29 16:10:54 PST 2009][V3.4.3_2009_01_28.05]
A:/home/s/starofl->
[clip]
Batch jobs: My jobs run on an interactive node but fail with an odd error in batch
If your jobs run on an interactive node but not in batch, you should test your environment on a batch node. One step is to run in an interactive batch queue. SGE provides this via the qsh command:
pdsf4 51% qsh
Your job 6126093 ("INTERACTIVE") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 6126093 has been successfully scheduled.
This will pop up an xterm on a batch node, in which you can check your environment and test the job directly. You can specify resources on the command line, as is done with qsub. For more general information about interactive jobs, see: Running Interactive Batch Jobs
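One way to check the environment is to capture it on both the interactive node and in the qsh xterm, then compare the two. This is a minimal sketch; the function names and the snapshot file names are hypothetical, and it assumes the snapshot files are written somewhere visible from both nodes.

```shell
# Write the current environment, sorted by variable name, to the given file.
# (snapshot_env and compare_env are hypothetical helpers.)
snapshot_env() {
    env | sort > "$1"
}

# Show the lines that differ between two snapshots.
# Lines starting with "<" come from the first file, ">" from the second.
# Returns 0 when the snapshots are identical.
compare_env() {
    diff "$1" "$2"
}
```

For example, run snapshot_env env.interactive on the interactive node and snapshot_env env.batch inside the qsh xterm, then compare_env env.interactive env.batch. Differences in variables such as PATH or LD_LIBRARY_PATH are a common place to start looking.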
Of course, don't hesitate to ask for help from NERSC support by sending email to consult@nersc.gov.
Common issue: memory limit in batch
One difference between interactive processes and batch jobs (even interactive batch) is the memory limit imposed by SGE. By default, SGE puts a 1GB memory limit on your jobs. If this is the problem, running the job in qsh will still show the problem. To override the limit, specify the following either on the qsub command line or in your $HOME/.sge_request file:
-l h_vmem=2G
This will increase the limit to 2GB. You can look at the setting for recently completed jobs from the accounting information provided by SGE. For example, looking at recent jobs run by the user starofl,
pdsf4 66% qacct -o starofl -j | grep maxvmem
maxvmem 2.000G
maxvmem 2.000G
maxvmem 2.000G
....
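If you prefer to set the limit once in the request file rather than on every qsub command line, the line can be appended with a small helper. This is a minimal sketch; the function name add_vmem_request is hypothetical, and it deliberately leaves the file alone if an h_vmem request is already present.

```shell
# Append "-l h_vmem=LIMIT" to an SGE request file unless the file already
# contains an h_vmem request.
# Usage: add_vmem_request LIMIT [FILE]   (FILE defaults to $HOME/.sge_request)
# (add_vmem_request is a hypothetical helper, not an SGE tool.)
add_vmem_request() {
    av_limit=$1
    av_file=${2:-$HOME/.sge_request}
    if [ -f "$av_file" ] && grep -q 'h_vmem=' "$av_file"; then
        echo "h_vmem is already set in $av_file; edit it by hand" >&2
        return 0
    fi
    printf '%s\n' "-l h_vmem=$av_limit" >> "$av_file"
}
```

Calling add_vmem_request 2G adds the same request shown above. Keep in mind that a setting in the request file applies to every job you submit, so a per-job request on the qsub command line may be the better choice when only a few jobs need more memory.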
How to retrieve SGE info for jobs that have finished
Accounting information can be obtained using the SGE qacct command, which by default queries the SGE accounting file $SGE_ROOT/default/common/accounting. Since the accounting file is rotated on PDSF, you will need to point to a specific accounting file to query your job. First, find the accounting file by date,
ls $SGE_ROOT/default/common/accounting.*
And then query the file by:
qacct -j yourjobid -f $SGE_ROOT/default/common/accounting.yourjobrundate
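If you do not remember the date a job ran, the rotated files can be tried one after another. This is a minimal sketch, assuming that qacct returns a non-zero exit status when the job is not in the file it was given; the function name find_job_accounting is hypothetical.

```shell
# Print the first rotated accounting file in which qacct finds the given job.
# Usage: find_job_accounting JOBID [DIR]
# DIR defaults to $SGE_ROOT/default/common.
# (find_job_accounting is a hypothetical helper; it assumes qacct exits
# non-zero when the job is not found in the file.)
find_job_accounting() {
    fj_jobid=$1
    fj_dir=${2:-$SGE_ROOT/default/common}
    for fj_file in "$fj_dir"/accounting.*; do
        # Skip the unexpanded pattern when no rotated files exist.
        [ -f "$fj_file" ] || continue
        if qacct -j "$fj_jobid" -f "$fj_file" >/dev/null 2>&1; then
            echo "$fj_file"
            return 0
        fi
    done
    return 1
}
```

Once the file is known, the full record can be printed with qacct -j yourjobid -f followed by that file name. Scanning every rotated file can be slow, so narrowing the search by date with ls first is still worthwhile when you know roughly when the job ran.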
If I can't find an answer to my question here, where else can I go?
It's likely that your question about PDSF is answered on the PDSF page [1], the PDSF FAQ[2] or the NERSC page [3].