BasicTests
Basic Tests
From your CE (let's call it ce.mysite.ac.be):
Simple job submission
- login as e.g. betest000:
su - betest000
- make a small shell script test.sh that contains:
#!/bin/bash
hostname
- change permissions:
chmod 755 test.sh
- run:
qsub -q betest test.sh
- this should produce two files similar to the ones below; test.sh.oXXXXX should contain the hostname of the WN on which the job ran
test.sh.e104754 test.sh.o104754
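The check on the output file can also be scripted. The following is a minimal sketch, assuming the job was submitted from the current directory and that the stdout file follows the <script>.o<jobnumber> pattern shown above; the function names latest_stdout_file and job_wrote_output are hypothetical helpers, not part of PBS.

```shell
#!/bin/bash
# Print the newest PBS stdout file (<script>.oNNNNN) for the given script
# name, looking in the current directory. Prints nothing if none exists.
latest_stdout_file() {
    ls -t -- "$1".o[0-9]* 2> /dev/null | head -n 1
}

# Succeed only if such a file exists and is not empty, i.e. the job
# wrote something (here: the WN hostname) to its standard output.
job_wrote_output() {
    local f
    f=$(latest_stdout_file "$1")
    [ -n "$f" ] && [ -s "$f" ]
}
```

For example, job_wrote_output test.sh && cat "$(latest_stdout_file test.sh)" prints the WN hostname once the job has finished.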
Testing stagein
- still as betest000
- make local tmp dir
mkdir tmp
- make small shell script test_stagein.sh that contains:
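#!/bin/sh
queue=${1:-betest}
base=stagein-`date +%Y%m%d_%H%M%S`
out=$PWD/tmp/$base.out
err=$PWD/tmp/$base.err
dat=$PWD/tmp/$base.dat
job=$PWD/tmp/$base.job

# file that will be staged in to the WN
echo test successful > $dat

# generate the PBS job script; variables and `hostname` are expanded now, on the CE
cat > $job << EOF
#!/bin/sh
#
#PBS -S /bin/sh
#PBS -m n
#PBS -q $queue
#PBS -o $out
#PBS -e $err
#PBS -r n
#PBS -W stagein=$dat@`hostname`:$dat
#PBS -l nodes=1
hostname
cat $dat
EOF

# NOTE: the left-hand side of this line was lost on the wiki page; submitting
# the job and keeping its id in $jid is an assumption based on the loop below
jid=`qsub $job` || exit

# wait until the job has left the queue
sleep 5
while qstat $jid 2> /dev/null
do
    sleep 5
done

echo Output:
echo =======
cat $out
echo =======
echo ''
echo Errors:
echo =======
cat $err
echo =======
echo ''

# NOTE: the left-hand side of this line was also lost on the wiki page;
# checking the job output for the staged-in text is an assumption
grep 'test successful' $out > /dev/null || echo test failed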
- change permissions:
chmod 755 test_stagein.sh
- run:
./test_stagein.sh betest
- if the test fails, look at this page for solutions: http://goc.grid.sinica.edu.tw/gocwiki/Unspecified_gridmanager_error
- remove the tmp dir:
rm -r tmp
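The polling loop in test_stagein.sh runs for as long as qstat still knows the job, so it never returns if the job stays queued. A bounded variant could look like the sketch below; wait_until_fails is a hypothetical helper name, and the retry count and delay are arbitrary values you would tune for your site.

```shell
#!/bin/bash
# wait_until_fails MAX_TRIES DELAY CMD [ARGS...]
# Re-run CMD every DELAY seconds until it fails (for qstat: the job has
# left the queue). Returns 0 in that case, or 1 if CMD still succeeds
# after MAX_TRIES runs.
wait_until_fails() {
    local max_tries=$1 delay=$2 tries=0
    shift 2
    while "$@" > /dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep "$delay"
    done
    return 0
}
```

In the test script this would replace the while loop, for example: wait_until_fails 60 5 qstat "$jid" || echo "job still in the queue after about 5 minutes".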
Testing local information system
- now run the following command (replace ce.mysite.ac.be and BEgrid-MYSITE with your actual CE and site name, respectively):
ldapsearch -x -H ldap://ce.mysite.ac.be:2170 -b mds-vo-name=resource,o=grid | grep GlueClusterService
- this should show lines similar to the following, depending on which VOs you actually support
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-beapps
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-becms
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-betest
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-cms
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-dteam
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-hone
GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-ops
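To compare the published queues with the VOs you expect to support, the queue names can be extracted from this output. This is only a sketch: list_queues is a hypothetical helper, and it assumes the jobmanager-<lrms>-<queue> naming shown above, with no dash inside the LRMS name.

```shell
#!/bin/bash
# Read "GlueClusterService: host:port/jobmanager-<lrms>-<queue>" lines on
# stdin and print the unique queue names, one per line, sorted.
# Lines that do not match this pattern are ignored.
list_queues() {
    sed -n 's|^GlueClusterService: .*/jobmanager-[^-]*-\(.*\)$|\1|p' | sort -u
}
```

It is meant to be used at the end of the pipeline, for example: ldapsearch -x -H ldap://ce.mysite.ac.be:2170 -b mds-vo-name=resource,o=grid | grep GlueClusterService | list_queues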
From the Belnet UI (gridy11.begrid.be):
- run the same command as on the CE to be sure that the site is seen from outside:
ldapsearch -x -H ldap://ce.mysite.ac.be:2170 -b mds-vo-name=resource,o=grid | grep GlueClusterService
- you should see the same output as above; if not, check that the necessary ports are open, in this case port 2170 in particular
- run:
voms-proxy-init --voms betest
- the following command should return the CE hostname
globus-job-run ce.mysite.ac.be:2119/jobmanager-fork /bin/hostname
- and this one should return the hostname of the WN on which the job ran
globus-job-run ce.mysite.ac.be:2119/jobmanager-pbs -queue betest /bin/hostname
- if you get "GRAM Job submission failed because the gatekeeper failed to find the requested service (error code 93)", make sure that the jobmanager service you are trying to use matches a GlueClusterService entry with the queue name removed; the queue is passed separately with -queue.
- e.g. for GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-betest you must use globus-job-run ce.mysite.ac.be:2119/jobmanager-pbs -queue betest /bin/hostname
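The mapping from a GlueClusterService entry to the matching globus-job-run call can be sketched as follows. gram_command is a hypothetical helper that only prints the command line; it assumes the queue name is the part after the last dash and therefore contains no dash itself.

```shell
#!/bin/bash
# Turn "GlueClusterService: host:port/jobmanager-<lrms>-<queue>" (with or
# without the attribute-name prefix) into the globus-job-run command line
# that targets that queue. The command is printed, not executed.
gram_command() {
    local service="${1#GlueClusterService: }"
    local queue="${service##*-}"     # text after the last dash
    local contact="${service%-*}"    # everything before the last dash
    printf 'globus-job-run %s -queue %s /bin/hostname\n' "$contact" "$queue"
}
```

For example, gram_command 'GlueClusterService: ce.mysite.ac.be:2119/jobmanager-pbs-betest' prints the command given in the last step above.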
Special tests for CREAMCE
http://grid.pd.infn.it/cream/field.php?n=Main.HowToCheckAndTestYourCREAMCE