Advanced Job Submission

1.  Advanced Sandbox Management


Input sandbox files need not be stored on the UI: they can also reside on a GridFTP server. Similarly, output files can be transferred to a GridFTP server when the job finishes.

Here is an example:

1.1  Choose files

Decide which files are needed for the job execution, for example:

[user@ui-2 ~]$ file prime
prime: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5,
statically linked, not stripped 

prime is a statically compiled binary that calculates a sequence of prime numbers.

1.2  List SEs

List the available storage elements:

[user@ui-2 ~]$ lcg-infosites --vo gridseed se
Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
47628864        86140           n.a     se-1.grid.seed
47628864        86140           n.a     se-1.grid.seed

1.3  Create directory

Create a directory on the storage element with edg-gridftp-mkdir and copy the file into it with the globus-url-copy command:

[user@ui-2 ~]$ edg-gridftp-mkdir gsiftp://se-1.grid.seed/tmp/user/
[user@ui-2 ~]$ globus-url-copy -vb  file:///home/user/prime gsiftp://se-1.grid.seed/tmp/user/prime
Source: file:///home/user/
Dest:   gsiftp://se-1.grid.seed/tmp/user/
  prime
       434885 bytes         1.80 MB/sec avg         1.80 MB/sec inst

The file is now in the /tmp/user directory of the GridFTP server.

1.4  Write script

Write the script that runs the binary file:

  job.sh
 ===========
 #!/bin/sh
 chmod 755 prime
 ./prime
 ===========

1.5  GridFTP files

Files stored on the GridFTP server are specified as GridFTP URIs in the InputSandbox attribute. In our case the file prime is in the /tmp/user directory of se-1.grid.seed, so:

InputSandbox = {"gsiftp://se-1.grid.seed/tmp/user/prime"};

It is also possible to specify a base GridFTP URI with the attribute InputSandboxBaseURI: in this case, files expressed as simple file names or as relative paths will be looked for under that base URI. Local files can still be defined using the file://<path> URI format. For example:

InputSandbox = {"prime", "file:///home/user/test"};
InputSandboxBaseURI = "gsiftp://se-1.grid.seed/tmp/user";

is equivalent to

InputSandbox = {"gsiftp://se-1.grid.seed/tmp/user/prime",
                "/home/user/test"};
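The resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the WMS implementation; the function name resolve_input_sandbox is hypothetical, and the handling of absolute paths (kept as local files) is an assumption.

```python
def resolve_input_sandbox(entries, base_uri=None):
    """Return the effective location of each InputSandbox entry."""
    resolved = []
    for entry in entries:
        if entry.startswith("file://"):
            # Explicit local file: strip the scheme to get the path on the UI.
            resolved.append(entry[len("file://"):])
        elif "://" in entry:
            # Already a full URI (for example gsiftp://...): keep it as is.
            resolved.append(entry)
        elif base_uri and not entry.startswith("/"):
            # Simple file name or relative path: look it up under the base URI.
            resolved.append(base_uri.rstrip("/") + "/" + entry)
        else:
            # Assumption: absolute paths, or entries without a base URI,
            # are local files on the UI.
            resolved.append(entry)
    return resolved


print(resolve_input_sandbox(
    ["prime", "file:///home/user/test"],
    base_uri="gsiftp://se-1.grid.seed/tmp/user",
))
```

For the example above, the sketch yields the GridFTP URI of prime followed by the local path /home/user/test.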

To store the output sandbox on a GridFTP server, the OutputSandboxDestURI attribute must be used together with the usual OutputSandbox attribute. The latter lists the output files created by the job on the WN that are to be transferred, and the former states where each output file is to be transferred. For example:

OutputSandbox = {"userout.log","usererr.log"};
OutputSandboxDestURI = {"gsiftp://se-1.grid.seed/tmp/user/userout.log",
                        "gsiftp://se-1.grid.seed/tmp/user/usererr.log"};

In this case glite-wms-job-output will not retrieve any results when the job has finished, because the files are on the GridFTP server.

Another possibility is to use the OutputSandboxBaseDestURI attribute to specify a base URI on a GridFTP server where the files listed in OutputSandbox will be copied. For example:

OutputSandbox = {"userout.log", "usererr.log"};
OutputSandboxBaseDestURI = "gsiftp://se-1.grid.seed/tmp/user/";

will copy both files under the specified GridFTP URI.
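As a rough illustration of how a base destination URI maps onto the listed files, the following Python sketch computes the destination of each output file. The function name output_destinations is hypothetical; the sketch only mirrors the behaviour described in the text.

```python
def output_destinations(output_sandbox, base_dest_uri):
    """Map each OutputSandbox file name to its URI under the base destination."""
    base = base_dest_uri.rstrip("/")  # tolerate a trailing slash in the base URI
    return {name: base + "/" + name for name in output_sandbox}


destinations = output_destinations(
    ["userout.log", "usererr.log"],
    "gsiftp://se-1.grid.seed/tmp/user/",
)
for name, uri in destinations.items():
    print(name, "->", uri)
```

The result is the same pair of URIs that would otherwise have to be written out one by one in OutputSandboxDestURI.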

1.6  Write JDL

Write the .jdl file:

   GridFTPTest.jdl
=======================
[
Executable = "job.sh";
StdOutput = "userout.log";
StdError = "usererr.log";
InputSandbox = {"job.sh","gsiftp://se-1.grid.seed/tmp/user/prime"};
OutputSandbox = {"userout.log", "usererr.log"};
OutputSandboxBaseDestURI = "gsiftp://se-1.grid.seed/tmp/user/";
]
=======================

1.7  Submit the job

[user@ui-2 ~]$ glite-wms-job-submit -a -o jobid GridFTPTest.jdl

Connecting to the service https://wms-4.grid.seed:7443/glite_wms_wmproxy_server

====================== .glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.seed:9000/ozbQfIEvwy1b1Gmk0_FDCA

The job identifier has been saved in the following file:
/home/user/jobid

===========================================================================

1.8  Check job status

[user@ui-2 ~]$ glite-wms-job-status -i jobid


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/ozbQfIEvwy1b1Gmk0_FDCA
Current Status:     Done (Success)
Logged Reason(s):
    -
    - Job terminated successfully
Exit code:          0
Status Reason:      Job terminated successfully
Destination:        ce-1.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Tue Sep 30 09:51:21 2008 CEST
*************************************************************

1.9  Get Output

If you try to retrieve the output using glite-wms-job-output, no files are returned:

[user@ui-2 ~]$ glite-wms-job-output -i jobid -o . --dir  subdir

Connecting to the service https://10.10.0.9:7443/glite_wms_wmproxy_server

================================================================================  
                             JOB GET OUTPUT OUTCOME

No output files to be retrieved for the job:
https://wms-4.grid.seed:9000/ozbQfIEvwy1b1Gmk0_FDCA

================================================================================

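With these options the command would normally create a directory called subdir in the current directory and download the output into it. In this case there is nothing to download, because the output files were sent to the GridFTP server.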

1.10  Get Output with GridFTP

Use the globus-url-copy command to get the desired output:

[user@ui-2 ~]$ globus-url-copy -vb gsiftp://se-1.grid.seed/tmp/user/userout.log \
file:/home/user/userout.log
Source: gsiftp://se-1.grid.seed/tmp/user/
Dest:   file:/home/user/
  userout.log
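When several output files have to be fetched, the command lines can be generated instead of typed by hand. The Python sketch below only builds and prints the globus-url-copy invocations, using the -vb option shown above; it does not run them. The function name fetch_commands is hypothetical.

```python
def fetch_commands(base_uri, names, local_dir):
    """Build one globus-url-copy command (as an argument list) per file."""
    commands = []
    for name in names:
        source = base_uri.rstrip("/") + "/" + name
        destination = "file://" + local_dir.rstrip("/") + "/" + name
        commands.append(["globus-url-copy", "-vb", source, destination])
    return commands


for command in fetch_commands(
    "gsiftp://se-1.grid.seed/tmp/user/",
    ["userout.log", "usererr.log"],
    "/home/user",
):
    # Print the command; it could be passed to subprocess.run on a real UI.
    print(" ".join(command))
```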

1.11  Display results

Here are the results of the job.

[user@ui-2 ~]$ cat userout.log
       1       2       3       5       7      11      13      17
      19      23      29      31      37      41      43      47
      53      59      61      67      71      73      79      83
      89      97     101     103     107     109     113     127
     131     137     139     149     151     157     163     167
     ...     ...     ...     ...     ...     ...     ...     ...

2.  Real Time Output Retrieval


The job output can be inspected in real time, while the job is running. The user enables job perusal by setting the attribute PerusalFileEnable to true in the job JDL. This makes the WN upload, at regular time intervals (defined by the PerusalTimeInterval attribute and expressed in seconds), a copy of the output files selected with the glite-wms-job-perusal command.

For example:

1. Create a simple shell script, job.sh, that writes out the hostname:

#!/bin/sh
#
/bin/hostname

2. The JDL file should look like this:

     PerusalTest.jdl
===========================================
[
Executable = "job.sh";
StdOutput = "stdout.log";
StdError = "stderr.log";
InputSandbox = {"job.sh"};
OutputSandbox = {"stdout.log","stderr.log"};
PerusalFileEnable = true;
PerusalTimeInterval = 15;
RetryCount = 0;
]
===========================================

3. Submit the job with glite-wms-job-submit. To enable job perusal, use the glite-wms-job-perusal command with the --set option. The user selects which output files to inspect using -f <filename>:

[user@ui-2 ~]$ glite-wms-job-perusal --set -f stdout.log -f stderr.log \
https://wms-4.grid.seed:9000/IGYXVHG6LvmmyV3oBp3B9g

Connecting to the service https://10.10.0.9:7443/glite_wms_wmproxy_server

===================== .glite-wms-job-perusal Success =====================

Files perusal has been successfully enabled for the job:
https://wms-4.grid.seed:9000/IGYXVHG6LvmmyV3oBp3B9g

==========================================================================

4. When the job starts, the user may inspect the output:

[user@ui-2 ~]$ glite-wms-job-perusal --get -f stdout.log -o . --dir subdir \
https://wms-4.grid.seed:9000/IGYXVHG6LvmmyV3oBp3B9g

Connecting to the service https://10.10.0.9:7443/glite_wms_wmproxy_server

===================== .glite-wms-job-perusal Success =====================

The retrieved files have been successfully stored in:
/tmp/user_IGYXVHG6LvmmyV3oBp3B9g

==========================================================================

--------------------------------------------------------------------------
file 1/1: stdout.log-20080930141119_1-20080930141119_1
--------------------------------------------------------------------------

This command creates a directory called subdir in the current directory and downloads the output into it.

5. See the results

[user@ui-2 user_IGYXVHG6LvmmyV3oBp3B9g]$ cat stdout.log-20080930141119_1-20080930141119_1

ce-2wn1.grid.seed

3.  Advanced Job Types


3.1  Job Collection

One of the most useful functionalities of WMProxy is the ability to submit job collections, defined as a set of independent jobs.

Here is an example of what it means and how to do it.

The simplest way to submit a collection is to put the JDL files of all the jobs in the collection in a single directory, and use --collection <dirname>, where <dirname> is the name of the directory.

So:

[user@ui-2 ~]$ mkdir jdl

You must put your JDL files in the jdl/ directory. Suppose that you have the following two jobs:

      job1.jdl
====================
[
Executable="/bin/hostname";
StdOutput="std.out";
StdError="std.err";
OutputSandbox={"std.out","std.err"};
]
====================

      job2.jdl
====================
[
Executable = "/bin/echo";
StdOutput = "std.out";
StdError = "std.err";
Arguments = "Hello Trieste!";
OutputSandbox = {"std.out","std.err"};
]
===================
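For larger collections the JDL files can be generated rather than written by hand. The following Python sketch writes one JDL file per job into a directory, here a temporary one so that nothing in the working directory is touched. The helper names format_value and write_collection are hypothetical, and the formatting covers only string and list attributes like those used in job1.jdl and job2.jdl.

```python
import os
import tempfile


def format_value(value):
    """Render a Python string or list of strings as a JDL value."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(f'"{item}"' for item in value) + "}"
    return f'"{value}"'


def write_collection(directory, jobs):
    """Write one <name>.jdl file per job into directory; return the paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for name, attributes in jobs.items():
        lines = ["["]
        lines += [f"{key} = {format_value(val)};" for key, val in attributes.items()]
        lines.append("]")
        path = os.path.join(directory, name + ".jdl")
        with open(path, "w") as handle:
            handle.write("\n".join(lines) + "\n")
        paths.append(path)
    return paths


jobs = {
    "job1": {"Executable": "/bin/hostname", "StdOutput": "std.out",
             "StdError": "std.err", "OutputSandbox": ["std.out", "std.err"]},
    "job2": {"Executable": "/bin/echo", "Arguments": "Hello Trieste!",
             "StdOutput": "std.out", "StdError": "std.err",
             "OutputSandbox": ["std.out", "std.err"]},
}

with tempfile.TemporaryDirectory() as tmp:
    written = write_collection(os.path.join(tmp, "jdl"), jobs)
    print(sorted(os.path.basename(path) for path in written))
```

On a real UI the target directory would be the jdl/ directory created above, which is then passed to --collection.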

Submit both jobs at the same time with:

[user@ui-2 ~]$ glite-wms-job-submit -a -o jobid --collection jdl/

Connecting to the service
https://wms-4.grid.seed:7443/glite_wms_wmproxy_server

====================== .glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.seed:9000/hBkspJ9W34qZz5vQaSFGTQ

The job identifier has been saved in the following file:
/home/user/jobid

==========================================================================

The jobID returned refers to the collection itself. To know the status of the collection and of all the jobs belonging to it, it is enough to use glite-wms-job-status as for any other kind of job:

[user@ui-2 ~]$ glite-wms-job-status https://wms-4.grid.seed:9000/hBkspJ9W34qZz5vQaSFGTQ

=============================================================
                BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/hBkspJ9W34qZz5vQaSFGTQ
Current Status:     Waiting
Submitted:          Wed Oct  1 10:39:03 2008 CEST
=============================================================

- Nodes information for:
    Status info for the Job : https://wms-4.grid.seed:9000/07bXsPOCN7BryYRW94SK5g
    Current Status:     Scheduled
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-1.grid.seed:2119/jobmanager-lcgpbs-gridseed
    Submitted:          Wed Oct  1 10:39:03 2008 CEST
=============================================================

    Status info for the Job : https://wms-4.grid.seed:9000/6ryYhbM82bWPKIbOlAkTTw
    Current Status:     Scheduled
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-1.grid.seed:2119/jobmanager-lcgpbs-gridseed
    Submitted:          Wed Oct  1 10:39:03 2008 CEST
=============================================================

Note: executing glite-wms-job-status for the collection is the only way to know the jobIDs of the jobs in the collection.
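Since the node jobIDs only appear in the status output, it can be handy to extract them automatically. The Python sketch below parses text in the format shown above; the function name parse_status is hypothetical, and the sample is an abridged copy of the transcript, so the regular expressions are only as reliable as that format.

```python
import re

STATUS_OUTPUT = """\
Status info for the Job : https://wms-4.grid.seed:9000/hBkspJ9W34qZz5vQaSFGTQ
Current Status:     Waiting
Submitted:          Wed Oct  1 10:39:03 2008 CEST

- Nodes information for:
    Status info for the Job : https://wms-4.grid.seed:9000/07bXsPOCN7BryYRW94SK5g
    Current Status:     Scheduled

    Status info for the Job : https://wms-4.grid.seed:9000/6ryYhbM82bWPKIbOlAkTTw
    Current Status:     Scheduled
"""


def parse_status(text):
    """Return (jobID, status) pairs in order of appearance."""
    job_ids = re.findall(r"Status info for the Job : (\S+)", text)
    states = re.findall(r"Current Status:\s+(.+)", text)
    return list(zip(job_ids, (state.strip() for state in states)))


entries = parse_status(STATUS_OUTPUT)
collection, nodes = entries[0], entries[1:]
print("collection:", collection)
for job_id, state in nodes:
    print("node:", job_id, state)
```

The first pair refers to the collection itself; the remaining pairs are the sub-jobs.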

3.2  Advanced Collection

A more flexible way to define a job collection is shown in the following JDL file. Its structure includes a global set of attributes, which are inherited by all the sub-jobs, and a set of attributes for each sub-job, which supersede the global ones.

[
Type = "Collection";
VirtualOrganisation = "gridseed";
MyProxyServer = "myproxy.grid.seed";
InputSandbox = {"numbers"};
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
DefaultNodeShallowRetryCount = 5;

Nodes = {
        [
        Executable = "node1.sh";
        InputSandbox = {root.InputSandbox, "node1.sh"};
        StdOutput = "myoutput1.txt";
        StdError = "std.err";
        OutputSandbox = {"myoutput1.txt","std.err"};
        Requirements = other.GlueCEPolicyMaxWallClockTime > 10;
        ],
        [
        NodeName = "mysubjob";
        Executable = "node2.sh";
        InputSandbox = {root.InputSandbox,"node2.sh"};
        StdOutput = "myoutput2.txt";
        StdError = "std.err";
        OutputSandbox = {"myoutput2.txt", "std.err"};
        ]
        }
]

The file numbers is an executable that prints out the first n numbers in the Fibonacci sequence. The wrapper scripts (node1.sh and node2.sh) pass a different parameter for each of the two jobs. For example, node1.sh looks like this:

     
      node1.sh
====================
#!/bin/sh
# node1.sh
#
chmod 755 numbers
./numbers 24
====================

In this JDL:

a. Type = "Collection"; declares a collection.

b. the jobs belong to the gridseed VO.

c. the MyProxy server to use for proxy renewal is myproxy.grid.seed.

d. all the jobs in the collection have by default the binary "numbers" in their sandbox (shared input sandbox).

e. the default maximum number of shallow resubmissions is 5.

f. the input sandbox of the first job (or node) contains all the default files (root.InputSandbox) plus an additional file, node1.sh; likewise, the second job adds node2.sh.

g. the first job must run on a CE allowing more than ten minutes of wall clock time.

h. the first job, which has no NodeName, appears as Node_0 in the output listing below; the second job is named mysubjob.
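The way per-node attributes supersede the global ones, and the way root.InputSandbox pulls in the shared files, can be illustrated with a small Python sketch. This models the behaviour described above and is not the WMS implementation; ROOT_SANDBOX and effective_node are hypothetical names, and only a few of the attributes from the JDL are included.

```python
ROOT_SANDBOX = object()  # stands in for root.InputSandbox in a node's sandbox


def effective_node(root_attrs, node_attrs):
    """Merge global attributes with a node's own; the node's values win."""
    merged = dict(root_attrs)
    merged.update(node_attrs)
    if "InputSandbox" in node_attrs:
        expanded = []
        for item in node_attrs["InputSandbox"]:
            if item is ROOT_SANDBOX:
                # Replace the reference with the shared input sandbox files.
                expanded.extend(root_attrs.get("InputSandbox", []))
            else:
                expanded.append(item)
        merged["InputSandbox"] = expanded
    return merged


root = {"InputSandbox": ["numbers"], "StdOutput": "std.out", "StdError": "std.err"}
node1 = {
    "Executable": "node1.sh",
    "InputSandbox": [ROOT_SANDBOX, "node1.sh"],
    "StdOutput": "myoutput1.txt",
}

for key, value in effective_node(root, node1).items():
    print(key, "=", value)
```

Here the node keeps the global StdError, overrides StdOutput, and ends up with both numbers and node1.sh in its input sandbox.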

Submit the job and retrieve the output when the jobs have finished.

[user@ui-2 ~]$ glite-wms-job-output https://wms-4.grid.seed:9000/twBf0M2QmQPej0OCGGXdFw

================================================================================
                        JOB GET OUTPUT OUTCOME

Output sandbox files for the DAG/Collection :
https://wms-4.grid.seed:9000/twBf0M2QmQPej0OCGGXdFw
have been successfully retrieved and stored in the directory:
/tmp/user_twBf0M2QmQPej0OCGGXdFw

================================================================================


[user@ui-2 user_twBf0M2QmQPej0OCGGXdFw]$ ls -l
total 12
-rw-rw-r--  1 user user  354 Oct  1 14:06 ids_nodes.map
drwxr-xr-x  2 user user 4096 Oct  1 14:06 mysubjob
drwxr-xr-x  2 user user 4096 Oct  1 14:06 Node_0

4.  Parametric Jobs


A parametric job is a job collection where the jobs are identical except for the value of a running parameter. It is described by a single JDL, where attribute values may contain the current value of the running parameter. An example of a JDL for a parametric job follows:

     [
        Type = "job";
        JobType = "parametric";
        Executable = "job.sh";
        StdInput = "input_PARAM_.txt";
        StdOutput = "output_PARAM_.txt";
        Parameters = 10;
        ParameterStart = 1;
        ParameterStep = 1;
        InputSandbox = {"job.sh", "input_PARAM_.txt"};
        OutputSandbox = "output_PARAM_.txt";
     ]

The submission of this job will produce 10 jobs as follows:

     [
        Type = "job";
        JobType = "normal";
        Executable = "job.sh";
        StdInput = "inputi.txt";
        StdOutput = "outputi.txt";
        InputSandbox = {"job.sh", "inputi.txt"};
        OutputSandbox = "outputi.txt";
     ]

i = 1, 2, ..., 10 

The JobType attribute is set to parametric. The special key _PARAM_ marks where the running parameter appears in attribute values; this string is replaced with the actual value during job submission. The attribute Parameters can be either a number or a list of items (typically strings, but not enclosed within double quotes): in the first case, the value represents the maximum value of the running parameter _PARAM_; in the second case, it is the list of values the parameter must take. ParameterStart is the initial value of the running parameter and ParameterStep is the increment of the running parameter between consecutive jobs. Here is another example where the Parameters attribute is a list of values:

        Parametric.jdl
================================
[
JobType = "Parametric";
Executable = "/bin/cat";
Arguments = "input_PARAM_.txt";
InputSandbox = "input_PARAM_.txt";
StdOutput = "myoutput_PARAM_.txt";
StdError = "myerror_PARAM_.txt";
Parameters = {EARTH,MARS,MOON};
OutputSandbox = {"myoutput_PARAM_.txt"};
]
===============================

Submitting the previous JDL produces 3 jobs with the following JDL:

     [
        Type = "job";
        JobType = "normal";
        Executable = "/bin/cat";
        Arguments = "inputvalue.txt";
        StdOutput = "myoutputvalue.txt";
        StdError = "myerrorvalue.txt";
        InputSandbox = "inputvalue.txt";
        OutputSandbox = {"myoutputvalue.txt"};
     ]

value =  "EARTH", "MARS", "MOON"
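The substitution of _PARAM_ can be mimicked with a short Python sketch, which is useful for checking which input files must exist before submission. It follows the description given in the text: a numeric Parameters value is treated as the maximum value of the running parameter, taken here as inclusive. That inclusiveness is an assumption that should be checked against the WMS version in use. The names expand_parametric and substitute are hypothetical.

```python
def substitute(value, text):
    """Replace _PARAM_ in a string or in every string of a list."""
    if isinstance(value, list):
        return [item.replace("_PARAM_", text) for item in value]
    if isinstance(value, str):
        return value.replace("_PARAM_", text)
    return value


def expand_parametric(template, parameters, start=0, step=1):
    """Return one attribute dictionary per value of the running parameter."""
    if isinstance(parameters, int):
        # Assumption: the numeric upper bound is inclusive, as in the text.
        values = range(start, parameters + 1, step)
    else:
        values = parameters
    return [
        {key: substitute(val, str(current)) for key, val in template.items()}
        for current in values
    ]


numeric = expand_parametric(
    {"StdInput": "input_PARAM_.txt", "StdOutput": "output_PARAM_.txt"},
    10, start=1, step=1,
)
named = expand_parametric(
    {"Arguments": "input_PARAM_.txt", "InputSandbox": ["input_PARAM_.txt"]},
    ["EARTH", "MARS", "MOON"],
)
print(len(numeric), numeric[0]["StdInput"])
print([job["Arguments"] for job in named])
```

The list of Arguments values printed for the second case corresponds to the input files that have to be present in the submission directory.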

So you must have the following files before submitting your job:

[user@ui-2 param]$ ls
inputEARTH.txt  inputMARS.txt  inputMOON.txt  Parametric.jdl

[user@ui-2 param]$ cat inputEARTH.txt
Testing of a parametric job.
Hello from Earth!

[user@ui-2 param]$ glite-wms-job-submit -a -o jobid Parametric.jdl

Connecting to the service https://wms-4.grid.seed:7443/glite_wms_wmproxy_server


====================== .glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.seed:9000/B5Ro6Bgl7AKm_VKmMYW9ug

The job identifier has been saved in the following file:
/home/user/param/jobid

==========================================================================


[user@ui-2 param]$ glite-wms-job-status -i jobid

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/B5Ro6Bgl7AKm_VKmMYW9ug
Current Status:     Done (Success)
Exit code:          0
Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

- Nodes information for:
    Status info for the Job : https://wms-4.grid.seed:9000/3vsUAk4ND3HEImjRsTizFg
    Current Status:     Done (Success)
    Logged Reason(s):
        -
        - Job terminated successfully
    Exit code:          0
    Status Reason:      Job terminated successfully
    Destination:        ce-1.grid.seed:2119/jobmanager-lcgpbs-gridseed
    Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

    Status info for the Job : https://wms-4.grid.seed:9000/CxsZV2RZtb87eShH7-eaBA
    Current Status:     Done (Success)
    Logged Reason(s):
        -
        - Job terminated successfully
    Exit code:          0
    Status Reason:      Job terminated successfully
    Destination:        ce-1.grid.seed:2119/jobmanager-lcgpbs-gridseed
    Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

    Status info for the Job : https://wms-4.grid.seed:9000/ZI_vXveak1Zo-URjY_zjtQ
    Current Status:     Done (Success)
    Logged Reason(s):
        -
        - Job terminated successfully
    Exit code:          0
    Status Reason:      Job terminated successfully
    Destination:        ce-1.grid.seed:2119/jobmanager-lcgpbs-gridseed
    Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

Please retrieve the output using glite-wms-job-output

So we have:

[user@ui-2 user_B5Ro6Bgl7AKm_VKmMYW9ug]$ ls
ids_nodes.map  Node_EARTH  Node_MARS  Node_MOON

[user@ui-2 Node_EARTH]$ ls
myoutputEARTH.txt

[user@ui-2 Node_EARTH]$ cat myoutputEARTH.txt
Testing of a parametric job.
Hello from Earth!

5.  Using MyProxy Server


In this section we submit a job using proxies from both a VOMS server and a MyProxy server:

  • Create the following C code with your favorite editor, or do as follows:
[user@ui-1 ]$ cat >simple.c
#include <stdio.h>
#include <stdlib.h>   /* atoi */
#include <unistd.h>   /* sleep */

int main(int argc, char **argv)
{
    int sleep_time;
    int input;
    int failure;

    if (argc != 3) {
        printf("Usage: simple <sleep-time> <integer>\n");
        failure = 1;
    } else {
        sleep_time = atoi(argv[1]);
        input      = atoi(argv[2]);

        printf("Thinking really hard for %d seconds...\n", sleep_time);
        sleep(sleep_time);
        printf("We calculated: %d\n", input * 2);
        failure = 0;
    }
    return failure;
}
Ctrl-D
  • Compile simple.c with gcc and name the executable simple.exe:
[user@ui-1 ]$ gcc simple.c -o simple.exe
  • To see the result of the program, execute it with the arguments 4 and 10:
[user@ui-1 ]$ ./simple.exe 4 10
Thinking really hard for 4 seconds...
We calculated: 20

As you can see from the code, the program sleeps for the number of seconds given by the first argument and then prints twice the value of the second argument, which only serves as dummy input.

  • Create the JDL file for the job:
[user@ui-1 ]$ cat >simple.jdl

Executable = "simple.exe";
StdOutput = "simple.out";
StdError = "simple.err";
InputSandbox = {"simple.exe"};
OutputSandbox = {"simple.out","simple.err"};
Arguments = "1300 10";  # 1300 s > 1200 s = 20 min

Ctrl-D
  • First create a VOMS proxy with a validity limited to 20 minutes and check its status using voms-proxy-info:
[user@ui-1 ]$ voms-proxy-init -voms gridseed -valid 0:20
Cannot find file or dir: /home/user/.glite/vomses
Enter GRID pass phrase:
Your identity: /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
Creating temporary proxy ....................................................................... Done
Contacting  voms.grid.seed:15000
[/O=GRIDBOX/DC=box/DC=grid/OU=Host Certificate/CN=voms.grid.seed] "gridseed" Done
Creating proxy ................................... Done
Your proxy is valid until Thu Oct 30 00:03:50 2008

[user@ui-1 ] voms-proxy-info 
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
type      : proxy
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:19:40


  • Submit simple.jdl to the grid:
[user@ui-1 ]$ glite-wms-job-submit -a -o jobid simple.jdl

Connecting to the service https://wms-4.grid.seed:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g

The job identifier has been saved in the following file:
/home/user/myproxytest/jobid

==========================================================================
  • Get the status of your submitted job and of your VOMS proxy:
[user@ui-1 ]$ glite-wms-job-status -i jobid


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
Current Status:     Running 
Status Reason:      Job successfully submitted to Globus
Destination:        ce-3.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Wed Oct 29 23:44:33 2008 CET
*************************************************************

[user@ui-1 ]$ voms-proxy-info
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
type      : proxy
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:14:11

  • Repeat the above step until the timeleft attribute of your VOMS proxy drops below 10 minutes, then check the status of your job:
[user@ui-1 ]$ voms-proxy-info
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
type      : proxy
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:09:52

[user@ui-1 ]$ glite-wms-job-status -i jobid


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
Current Status:     Aborted 
Logged Reason(s):
    - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
    - Job has been terminated by the batch system
    - Got a job held event, reason: Globus error 131: the user proxy expired (job is still running)
    - Job got an error while in the CondorG queue.
Status Reason:      Job proxy is expired.
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Wed Oct 29 23:44:33 2008 CET
*************************************************************


Your job was aborted. A proxy obtained from the VOMS server has a limited lifetime that cannot be extended, so with a plain VOMS proxy you must request a lifetime longer than the runtime of your job. Now we use a proxy from the MyProxy server instead:
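The comparison between the remaining proxy lifetime and the job runtime can be checked before submission. The Python sketch below parses the timeleft field from voms-proxy-info output in the format shown above and compares it with the 1300 seconds requested in simple.jdl. The function names are hypothetical, the sample is an abridged copy of the transcript, and queueing time on the CE is ignored.

```python
PROXY_INFO = """\
type      : proxy
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:19:40
"""


def timeleft_seconds(proxy_info_text):
    """Return the timeleft field of voms-proxy-info output in seconds."""
    for line in proxy_info_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "timeleft":
            hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
            return hours * 3600 + minutes * 60 + seconds
    raise ValueError("no timeleft field found")


def proxy_outlives_job(proxy_info_text, runtime_seconds):
    """True if the proxy is still valid after runtime_seconds of execution."""
    return timeleft_seconds(proxy_info_text) >= runtime_seconds


print(timeleft_seconds(PROXY_INFO))          # 1180
print(proxy_outlives_job(PROXY_INFO, 1300))  # False
```

With 19 minutes and 40 seconds left, the proxy expires before the 1300-second job can finish, which matches the abort seen above.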

  • Store a proxy on the MyProxy server:
[user@ui-1 ]$ export GT_PROXY_MODE=old
[user@ui-1 ]$ myproxy-init -n -d -t 1
Your identity: /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol
Enter GRID pass phrase for this identity:
Creating proxy .............................................................. Done
Proxy Verify OK
Your proxy is valid until: Thu Nov  6 00:00:50 2008
A proxy valid for 168 hours (7.0 days) for user
/O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol now exists
on myproxy.grid.seed.
  • Get a delegated proxy from the MyProxy server:
[user@ui-1 ]$ myproxy-get-delegation -d -o myproxy
Enter MyProxy pass phrase:
A credential has been received for user 
/O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol in myproxy.

We now have a proxy, issued by the MyProxy server, stored in the file myproxy. To submit a job to the WMS we still need a VOMS proxy, so we add the attributes of the VO (here gridseed) to it. We again request a validity of 20 minutes, so that the result can be compared with the plain VOMS proxy used before:

[user@ui-1 ] voms-proxy-init -cert myproxy -key myproxy -voms gridseed -valid 0:20
Cannot find file or dir: /home/user/.glite/vomses
Your identity: 
/O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy/CN=proxy
Creating temporary proxy ...................................... Done
Contacting  voms.grid.seed:15000 
[/O=GRIDBOX/DC=box/DC=grid/OU=Host Certificate/CN=voms.grid.seed] "gridseed" Done
Creating proxy ......................................................... Done
Your proxy is valid until Thu Oct 30 00:29:40 2008

[user@ui-1 ]  voms-proxy-info 
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
type      : unknown
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:18:37

  • Submit the same simple.jdl :
[user@ui-1 ] glite-wms-job-submit -a -o jobid simple.jdl

Connecting to the service https://wms-4.grid.seed:7443/glite_wms_wmproxy_server


====================== glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA

The job identifier has been saved in the following file:
/home/user/myproxytest/jobid

==========================================================================

  • Check the proxy and the job status as before until the timeleft of the VOMS proxy drops below 10 minutes, and see the status of the jobs listed in jobid:
[user@ui-1 ]$ glite-wms-job-status -i jobid

------------------------------------------------------------------
1 : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
2 : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
a : all
q : quit
------------------------------------------------------------------

Choose one or more jobId(s) in the list - [1-2]all:a



*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
Current Status:     Aborted 
Logged Reason(s):
    - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
    - Job has been terminated by the batch system
    - Got a job held event, reason: Globus error 131: the user proxy expired (job is still running)
    - Job got an error while in the CondorG queue.
Status Reason:      Job proxy is expired.
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Wed Oct 29 23:44:33 2008 CET
*************************************************************


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
Current Status:     Running 
Status Reason:      Job successfully submitted to Globus
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Thu Oct 30 00:11:52 2008 CET
*************************************************************

[user@ui-1 ] voms-proxy-info 
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
type      : unknown
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:15:06


[user@ui-1] voms-proxy-info 
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
type      : unknown
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:09:42
[user@ui-1]$ glite-wms-job-status -i jobid

------------------------------------------------------------------
1 : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
2 : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
a : all
q : quit
------------------------------------------------------------------

Choose one or more jobId(s) in the list - [1-2]all:a



*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
Current Status:     Aborted 
Logged Reason(s):
    - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
    - Job has been terminated by the batch system
    - Got a job held event, reason: Globus error 131: the user proxy expired (job is still running)
    - Job got an error while in the CondorG queue.
Status Reason:      Job proxy is expired.
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Wed Oct 29 23:44:33 2008 CET
*************************************************************


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
Current Status:     Running 
Status Reason:      Job successfully submitted to Globus
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Thu Oct 30 00:11:52 2008 CET
*************************************************************

You can see that the job is still running although the time left on the VOMS proxy is less than 10 minutes. This is because the MyProxy server extends the validity of the job's proxy for as long as the job needs to run on the grid.

  • Repeat the above step until the timeleft of the VOMS proxy reaches zero:
[user@ui-1]$ voms-proxy-info 
subject   : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy/CN=proxy
issuer    : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
identity  : /O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy
/CN=proxy
type      : unknown
strength  : 512 bits
path      : /tmp/x509up_u544
timeleft  : 0:00:00
[user@ui-1 myproxytest]$ glite-wms-job-status -i jobid

------------------------------------------------------------------
1 : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
2 : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
a : all
q : quit
------------------------------------------------------------------

Choose one or more jobId(s) in the list - [1-2]all:a


**** Error: UI_PROXY_EXPIRED ****  
Proxy certificate validity expired.

  • You cannot get the status of your job. Why? Does it mean the job was killed? Create a new VOMS proxy and get the job status again:
[user@ui-1 ]$ voms-proxy-init -cert myproxy -key myproxy -voms gridseed -valid 0:20
Cannot find file or dir: /home/user/.glite/vomses
Your identity: 
/O=GRIDBOX/DC=box/DC=grid/OU=Personal Certificate/CN=user arabgol/CN=proxy/CN=proxy/CN=proxy
Creating temporary proxy .................................... Done
Contacting  voms.grid.seed:15000 
[/O=GRIDBOX/DC=box/DC=grid/OU=Host Certificate/CN=voms.grid.seed] "gridseed" Done
Creating proxy .................................................................. Done
Your proxy is valid until Thu Oct 30 00:52:00 2008


[user@ui-1 ]$ glite-wms-job-status -i jobid

------------------------------------------------------------------
1 : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
2 : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
a : all
q : quit
------------------------------------------------------------------

Choose one or more jobId(s) in the list - [1-2]all:a



*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/gTG11_cnnBkOdn5dcoXs-g
Current Status:     Aborted 
Logged Reason(s):
    - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
    - Job has been terminated by the batch system
    - Got a job held event, reason: Globus error 131: the user proxy expired (job is still running)
    - Job got an error while in the CondorG queue.
Status Reason:      Job proxy is expired.
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Wed Oct 29 23:44:33 2008 CET
*************************************************************


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.seed:9000/ejp23or4csZQVK5kykmuoA
Current Status:     Running 
Status Reason:      Job successfully submitted to Globus
Destination:        ce-2.grid.seed:2119/jobmanager-lcgpbs-gridseed
Submitted:          Thu Oct 30 00:11:52 2008 CET
*************************************************************

As you can see, your job is still running and has not been killed. What do you infer from this?