PDSF - General

From Computing@RNC

What is PDSF?

PDSF stands for Parallel Distributed Systems Facility. It is a computer farm located at NERSC in Oakland, CA, where one can process a large number of similar computing tasks in parallel. For more information, please see http://www.nersc.gov/nusers/systems/PDSF/.

How do I contact the PDSF admin team in case of a problem?

You can always send email to consult@nersc.gov, or submit a ticket through the web interface:

https://help.nersc.gov

Then follow "Ask nersc Consultants".

My password no longer works at PDSF

  • I mistyped the password three (or more) times

For security reasons, NERSC will *lock out* an account that has three or more failed login attempts. The lock out will last *12 hours*. If this happens to you, contact NERSC account support (1-800-666-3772, option #2) to have it reset.

  • It's been a while, and I can't remember my password

Please call NERSC account support (1-800-666-3772, option #2) to have your password reset. Note that if you have not logged in for a long time (~6 months), your account may have been deactivated. If this is the case, NERSC account support will ask you to re-submit your signed NERSC User Agreement.


How to create & access individual web content on PDSF

  • Current & future model
      * static content goes under the group-writable area: /project/projectdirs/star/www/
      * accessed via http://portal.nersc.gov/project/star/
      * please add your own user-area subdirectory, e.g. http://portal.nersc.gov/project/star/username
      * there will not be a system-wide migration; each user should migrate their own web area
      * static HTML only (i.e. dynamic content must be pre-generated)
  • Old model (deprecated)
      * static content put into $HOME/public_html
      * accessible via http://pdsfweb01.nersc.gov/~username

How to use IO resources of networked file systems (*eliza*)

The networked file systems on PDSF are visible from both the interactive nodes (pdsf.nersc.gov) and the batch nodes. Batch processes should always specify an IO resource in the job description. The STAR scheduler handles this more or less automatically. For explicit job submission, use:


    qsub -hard -l elizaXXio=1 [script]


Here -l elizaXXio=1 names the IO resource of the networked file system that the job accesses (XX is the number of the eliza system) and assigns a resource limit of 1. Failing to supply resource limits explicitly can let your jobs take a larger fraction of an IO resource, degrading its overall performance to the detriment of everyone.

Users who abuse the limits will have their use of the system limited more directly. If you over-specify your resource needs, your job will likely not run. For example, if you always submit with -l eliza1io=1,eliza8io=1,eliza9io=1 just because you have used those resources in the past, your jobs can wait in the queue for a long time until all of those resources become free at the same time.
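One way to avoid over-specifying is to derive the single resource a job needs from the path it reads. The following is a minimal sketch, not an official PDSF tool: the function name io_resource_for_path is hypothetical, and it assumes the eliza file systems are mounted as /elizaN/... on the nodes.

```shell
#!/bin/sh
# Sketch: derive the SGE IO resource request for a data path.
# ASSUMPTION: eliza file systems are mounted as /elizaN/... (not confirmed here).
io_resource_for_path() {
    case "$1" in
        /eliza[0-9]*)
            # Drop the leading slash, then everything after the first path component.
            fs=${1#/}
            fs=${fs%%/*}
            printf '%sio=1\n' "$fs"
            ;;
        *)
            # Not an eliza path: there is no eliza IO resource to request.
            return 1
            ;;
    esac
}
```

A submission could then look like qsub -hard -l "$(io_resource_for_path /eliza9/star/data)" myjob.sh, where the path and script name are placeholders.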

For more information about setting IO resource usage, please see this PDSF FAQ entry.

HPSS Information

For a good overview of HPSS, see the NERSC-HPSS pages (http://www.nersc.gov/nusers/systems/HPSS/). Some additional documentation of the HSI command line interface can be found from Gleicher Enterprises.

Some examples

Querying total filesize of a file type in a directory

   pdsf1 56% hsi -P "du /nersc/projects/starofl/embedding/production_dAu/Piminus_207_1233091609/P08ie.SL08f/2008/024/*event.root"
   ----------------------- 41099121	total 512-byte blocks, 74 Files (21,042,749,845 bytes)


SGE Questions

For a good overview, please see the PDSF SGE page (http://www.nersc.gov/nusers/systems/PDSF/software/SGE.php). There are several tools available to monitor your jobs. The qmon command is a graphical interface to SGE, which can be quite useful if your network connection is good. Try the command-line tools 'sgeuser' and 'qstat' for overall farm status and your individual job listings, respectively; both are discussed on the overview page linked above.

How can I tell if an SGE Consumable Resource is used up?

Consumable resources in SGE on PDSF (known as complexes) are configured globally but are requested per host. To determine whether a resource is available, run the SGE qhost command against any host known to the system; SGE reports the global value of the resource, which is the same for every host.

  qhost -F eliza8io,eliza9io,eliza13io -h pc1008

This produces output showing the availability of these IO resources. A value of 0.0000 means the resource is fully used; a value greater than 0 means new jobs can still access it.

Checking all resources with supplied script

We put together a simple script to check the resources explicitly. It is installed on PDSF:

/usr/common/nsg/bin/getIOlimits.pl 

It is run without arguments and produces output such as:

pdsfgrid1 105% getIOlimits.pl
SGE GlobalResource: Total eliza7io = 200   	 Available=200    	   Used= 0.0 %
SGE GlobalResource: Total hpssio = 200   	 Available=199    	   Used= 0.5 %
SGE GlobalResource: Total eliza4io = 200   	 Available=57    	   Used= 71.5 %
SGE GlobalResource: Total scpidl = 25   	 Available=25    	   Used= 0.0 %
SGE GlobalResource: Total eliza12io = 150   	 Available=150    	   Used= 0.0 %
SGE GlobalResource: Total eliza11io = 150   	 Available=150    	   Used= 0.0 %
SGE GlobalResource: Total projectio = 500   	 Available=500    	   Used= 0.0 %
SGE GlobalResource: Total eliza13io = 150   	 Available=0    	   Used= 100.0 %
SGE GlobalResource: Total eliza3io = 150   	 Available=56    	   Used= 62.7 %
SGE GlobalResource: Total eliza1io = 0   	 Available=0    	   Used= 100.0 %
SGE GlobalResource: Total eliza10io = 100   	 Available=100    	   Used= 0.0 %
SGE GlobalResource: Total eliza8io = 175   	 Available=175    	   Used= 0.0 %
SGE GlobalResource: Total snapidl = 768   	 Available=768    	   Used= 0.0 %
SGE GlobalResource: Total eliza9io = 200   	 Available=0    	   Used= 100.0 %
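To pick out only the exhausted resources from a listing like the one above, the output can be filtered with awk. This is a sketch that assumes the line layout shown above stays the same; the function name exhausted_io_resources is hypothetical.

```shell
#!/bin/sh
# Sketch: read getIOlimits.pl output on stdin and print the names of
# resources whose "Available=" value is 0.
# ASSUMPTION: lines keep the layout "SGE GlobalResource: Total NAME = N  Available=M  Used= P %".
exhausted_io_resources() {
    awk '$1 == "SGE" && $2 == "GlobalResource:" {
        # Field 7 is the "Available=M" token; split it at the equals sign.
        split($7, avail, "=")
        if (avail[1] == "Available" && avail[2] + 0 == 0) print $4
    }'
}
```

Typical use would be getIOlimits.pl | exhausted_io_resources. Note that a resource configured with a total of 0 (such as eliza1io in the listing above) is also reported, since nothing is available on it.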


Batch jobs: Local scratch space ($SCRATCH)

Each node has local disk storage associated with it, available through $SCRATCH. It is recommended that users read and write to the scratch area while their jobs are running, then copy their output files to the final destination (either HPSS or GPFS disk).

SGE, the batch queue system, maintains a unique disk area for each job as scratch. The environment variable $SCRATCH is mapped to this area for each individual job. This means that users do not have to worry about their jobs running on different cores of one node interfering with each other.

It's important to remember that SGE removes this directory as soon as the job is complete. If you want to keep any output files, your job will need to archive those files before exiting.
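For output that goes to a networked disk rather than HPSS, the copy-before-exit step might be wrapped in a small helper. This is only a sketch: the function name stage_out and the destination directory are hypothetical, and the destination should be whatever final location your job uses.

```shell
#!/bin/sh
# Sketch: copy files matching a pattern from the job scratch area to a
# destination directory before the job exits.
stage_out() {
    src_dir=$1      # normally "$SCRATCH"
    dest_dir=$2     # hypothetical final destination on a networked file system
    pattern=$3      # shell glob, e.g. '*.root'
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/$pattern; do
        [ -e "$f" ] || continue    # the pattern matched nothing
        cp "$f" "$dest_dir"/ || return 1
    done
}
```

At the end of a job script this could be called as stage_out "$SCRATCH" /path/to/final/area '*.root', with the destination path replaced by your own.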

Batch jobs: I get an error when I try to create a directory under /scratch. What do I do?

The local scratch area is now managed by SGE, and users *cannot* create and maintain their own directories on /scratch. The disk area you can write to is pointed to by the environment variable $SCRATCH or $TMPDIR. Please use these instead of /scratch/$username:

#!/bin/sh

# First argument: the MuDst file to analyze
mudstfile=$1
# Work in the per-job scratch directory provided by SGE
cd "$SCRATCH" || exit 1
pwd
root4star -q ~/analysis/macros/myAnalysis.C "$mudstfile"
mv myoutput.root "$mudstfile.analysis.root"
# Archive the output to HPSS before SGE removes the scratch directory
hsi "cd analysis; prompt; mput $mudstfile.analysis.root"

This produces output such as:

/scratch/1135296.1.starprod.64bit.q
Warning in <TEnvRec::ChangeValue>: duplicate entry <Library.TMCParticle=libEGPythia6.so 
libEG.so libGraf.so libVMC.so> for level 0; ignored
  *******************************************
  *                                         *
  *        W E L C O M E  to  R O O T       *
  *                                         *
  *   Version  5.12/00f   23 October 2006   *
  *                                         *
  *  You are welcome to visit our Web site  *
  *          http://root.cern.ch            *
  *                                         *
  *******************************************

FreeType Engine v2.1.9 used to render TrueType fonts.
Compiled on 23 July 2008 for linux with thread support.

CINT/ROOT C/C++ Interpreter version 5.16.13, June 8, 2006
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
*** Float Point Exception is OFF ***
 *** Start at Date : Thu Oct 15 11:08:59 2009
QAInfo:You are using STAR_LEVEL : new, ROOT_LEVEL : 5.12.00 and node : pdsf3

[clip]

***********************************************************************
*              NERSC HPSS User SYSTEM (archive.nersc.gov)             *
***********************************************************************
Username: aarose  UID: 34500  Acct: 34500(34500) Copies: 1 Firewall: off [hsi.3.4.3 Thu Jan 29 16:10:54 PST 2009][V3.4.3_2009_01_28.05] 
A:/home/s/starofl-> 
[clip]

Batch jobs: My jobs run on an interactive node but fail with an odd error in batch

If your jobs run on an interactive node but not in batch, you want to test your environment on the batch node. One step is to run in an interactive batch queue. SGE provides this via the qsh command:

pdsf4 51% qsh
Your job 6126093 ("INTERACTIVE") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 6126093 has been successfully scheduled.

This will pop up an xterm on a batch node, in which you can check your environment and test the job directly. You can specify resources on the command line, just as with qsub. For more general information about interactive jobs, see: Running Interactive Batch Jobs

Of course, don't hesitate to ask for help from NERSC support by sending email to consult@nersc.gov.

Common issue: memory limit in batch

One difference between interactive processes and batch jobs (even interactive batch) is the memory limit imposed by SGE. By default, SGE puts a 1GB memory limit on your jobs. If this is the problem, it will also show up when you run the job in qsh. To override the limit, specify the following either on the qsub command line or in your $HOME/.sge_request file:

 -l h_vmem=2G 

This will increase the limit to 2GB. You can look at the memory used by recently completed jobs in the accounting information provided by SGE. For example, for recent jobs run by the user starofl:

pdsf4 66% qacct -o starofl -j | grep maxvmem 
maxvmem      2.000G
maxvmem      2.000G
maxvmem      2.000G
....
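To see at a glance how close your jobs come to the limit, the maxvmem lines can be reduced to a single peak value. This is a sketch under the assumption that qacct prints values with a K, M, or G suffix (as in the output above) or as a plain byte count; the function name peak_maxvmem_mb is hypothetical.

```shell
#!/bin/sh
# Sketch: read qacct output on stdin and print the largest maxvmem value in MB.
# ASSUMPTION: values look like "2.000G", "512.000M", "100.000K", or a bare byte count.
peak_maxvmem_mb() {
    awk '$1 == "maxvmem" {
        value = $2 + 0                      # leading numeric part of the field
        unit = substr($2, length($2))       # last character, possibly a unit suffix
        if (unit == "G") value *= 1024
        else if (unit == "K") value /= 1024
        else if (unit != "M") value /= (1024 * 1024)   # treat as plain bytes
        if (value > peak) peak = value
    }
    END { printf "%.0f\n", peak }'
}
```

For example, qacct -o starofl -j | peak_maxvmem_mb prints the highest memory use among that user's recorded jobs. A peak at or near your h_vmem setting suggests jobs are hitting the limit.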

How to retrieve SGE info for jobs that have finished

Accounting information can be obtained using the SGE qacct command, which by default queries the SGE accounting file $SGE_ROOT/default/common/accounting. Since the accounting file is rotated on PDSF, you will need to point to a specific accounting file to query your job. First, find the accounting file by date:

 ls $SGE_ROOT/default/common/accounting.*

And then query the file by:

 qacct -j yourjobid -f $SGE_ROOT/default/common/accounting.yourjobrundate
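If you do not remember the run date, the two steps above can be combined by trying each accounting file in turn. This is a sketch: the function name find_job_accounting is hypothetical, and it assumes qacct returns a non-zero exit status when the job is not in the file it was given.

```shell
#!/bin/sh
# Sketch: search the current and rotated SGE accounting files for one job id.
# ASSUMPTION: qacct exits non-zero when the job is absent from the given file.
find_job_accounting() {
    jobid=$1
    for acct in "$SGE_ROOT"/default/common/accounting*; do
        [ -f "$acct" ] || continue
        # Print the record and stop at the first file that contains the job.
        if qacct -j "$jobid" -f "$acct" 2>/dev/null; then
            return 0
        fi
    done
    echo "job $jobid not found in any accounting file" >&2
    return 1
}
```

Calling find_job_accounting yourjobid prints the accounting record from the first file that contains it. With many rotated files this can be slow, so narrowing by date as described above is still preferable when you know it.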


If I can't find an answer to my question here, where else can I go?

It's likely that your question about PDSF is answered on the PDSF page [1], the PDSF FAQ[2] or the NERSC page [3].

