From: Greg Ercolano <erco@(email surpressed)>
Subject: [Q+A] What's the best way to parse output from the rush command line?
   Date: Wed, 09 Aug 2006 15:38:20 -0400

Msg# 1371

> We're writing scripts that invoke rush commands to query information
> from our job server. It's clear how to parse the reports, but one concern
> is handling error messages properly. What's the 'correct' way to handle
> error messages from rush commands?

    True, the details of parsing e.g. error messages are not openly documented.

    The examples on the HOW-TO page and in the rush example submit scripts
    do show recommended techniques. But they don't come out and say what
    the rules are.

    I'll try to add a section on that in the 'How-To' docs.

    Meanwhile, I'll outline some info here..

    There is a 'standard' that rush follows internally for reporting error messages.
    For instance, IRUSH and WWW-RUSH make use of this, so that
    error messages don't get split into the columns of the report
    fields, which would mess up sorting. Also, the rush example
    submit scripts make use of this to report error messages properly.

    The edict is:

       --------------------------------------------------------------------

       o All error messages reported by rush(1)
         lead off with either "rush: [text..]" or "*** [text..]".

         Examples:

               rush: 'rush -lj': tahoe: No route to host

               *** NO RESPONSE FROM:
               ***     alturas burke caddo crystal eagle
               ***     elk elwell gull rico sardis shasta

       o All rush commands report proper exit codes; '0' means it worked,
         and non-zero means it failed.

       o Most error messages are printed to stderr, though some warnings
         are printed to stdout.

       o All regular report output should be to stdout.

       --------------------------------------------------------------------

    So when parsing rush(1) output from eg. perl, the way to detect
    an error message is:

	if ( /^\*\*\*/ || /^rush:/ ) { ..an error message.. }
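
    As a quick check of that test, here is a minimal sketch that runs it
    against the sample error lines shown above. The helper name
    is_rush_error() and the last sample line are made up for this
    illustration; neither is part of rush.

```perl
#!/usr/bin/perl -w
use strict;

# Returns 1 if the line follows rush's error message convention
# ("rush: ..." or "*** ..." at the start of the line), 0 otherwise.
# NOTE: is_rush_error() is a hypothetical helper, not part of rush.
sub is_rush_error
{
    my ($line) = @_;
    return ( $line =~ /^\*\*\*/ || $line =~ /^rush:/ ) ? 1 : 0;
}

# Sample lines: the first three are the error examples from above,
# the last is a made-up stand-in for a line of report output.
my @lines = (
    "rush: 'rush -lj': tahoe: No route to host\n",
    "*** NO RESPONSE FROM:\n",
    "***     alturas burke caddo crystal eagle\n",
    "some report line\n",
);
foreach my $line ( @lines )
    { print( (is_rush_error($line) ? "ERROR : " : "REPORT: "), $line ); }
```

    Both patterns are anchored with '^', so a report line that merely
    contains "rush:" somewhere in the middle is not treated as an error.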

    When parsing rush output, if you want to be sure to get all error
    messages, combine stdout and stderr. In perl and bash, you would use
    "2>&1" to do this. (see below)

PARSING REPORTS
---------------
    All the rush report commands are similar in how they should be parsed.
    The general technique would be:

        open(RUSH, "rush -lj 2>&1 |") || die "open(rush -lj..) failed: $!\n";
        while (<RUSH>)
        {
            if ( /^\*\*\*/ || /^rush:/ ) { print STDERR $_; }   # ..error message..
            else                         { print $_; }          # ..parse report..
        }
        close(RUSH);

    An actual example showing how to parse the 'rush -laj' report:

------- snip
#!/usr/bin/perl -w
use strict;

# OPEN RUSH JOBS REPORT
#     > Set TCP timeout to 5 seconds (default is 45)
#     > Be sure to get both stdout and stderr (2>&1)
#

my $errmsg = "";                # errors, if any
my @runjobs;                    # jobs that are 'running'
my @donejobs;                   # jobs that are 'done'

unless ( open(RUSH, "rush -laj -s 8 2>&1 |") )
    { print STDERR "open(rush -laj..) failed: $!\n"; exit(1); }
while (<RUSH>)
{
    # HANDLE ERROR MESSAGES
    if ( /^\*\*\*/ || /^rush:/ )        # check for error messages
        { $errmsg .= $_; }              # save errors for later
    else
    {
        # HANDLE REPORT OUTPUT
        if ( /^Run/  ) { push(@runjobs,  $_); next; }
        if ( /^Done/ ) { push(@donejobs, $_); next; }
    }
}
close(RUSH);

# PRINT RESULTS
print "RUN JOBS:\n";  foreach ( @runjobs  ) { print "$_"; } print "\n";
print "DONE JOBS:\n"; foreach ( @donejobs ) { print "$_"; } print "\n";

# PRINT ERRORS (IF ANY)
if ( $errmsg ne "" )
    { print "ERRORS:\n$errmsg\n"; }

------- snip

    This general parsing approach should work for all rush reports.
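
    The rules above also say rush commands return proper exit codes.
    Here is a rough sketch of how the exit code could be picked up along
    with the error lines. It relies on standard perl behavior: close() on
    a piped open waits for the command and sets $?. The helper name
    run_rush() is made up for this example and is not part of rush.

```perl
#!/usr/bin/perl -w
use strict;

# Run a command with stdout+stderr combined, and split its output into
# error lines ("rush:" / "***" prefixes) and everything else.
# Returns ( exit_code, \@errors, \@report ).
# NOTE: run_rush() is a hypothetical helper name, not part of rush.
sub run_rush
{
    my ($cmd) = @_;
    my (@errors, @report);
    my $pid = open(my $fh, "$cmd 2>&1 |");
    unless ( defined($pid) )
        { return (-1, [ "open($cmd) failed: $!\n" ], []); }
    while (<$fh>)
    {
        if ( /^\*\*\*/ || /^rush:/ ) { push(@errors, $_); }
        else                         { push(@report, $_); }
    }
    close($fh);                     # piped open: close() waits, sets $?
    my $exitcode = $? >> 8;         # high byte of $? is the exit code
    return ($exitcode, \@errors, \@report);
}

# Example use (needs rush in the PATH):
#     my ($exit, $errs, $rep) = run_rush("rush -lj");
#     if ( $exit != 0 || @$errs )
#         { print STDERR "rush -lj failed (exit $exit):\n", @$errs; }
```

    Because the command line goes through the shell (for the 2>&1), the
    open() can succeed even when the command itself can't be run, so the
    exit code is worth checking in addition to the error lines.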

SUBMITTING JOBS
---------------
    When submitting jobs, the best way to detect a successful submission
    is to look for a line that contains RUSH_JOBID followed by a single
    character, followed by a jobid. As a perl regex, that would be
    /RUSH_JOBID.(\S+)/ where the '.' matches any one character, and the
    (\S+) captures the jobid.

    When submitting a job, "warning messages" are sometimes printed even
    though the job submits anyway. In other cases, an error is printed
    and the job is not started.

    To differentiate, the simple rule is: if 'rush -submit' printed a line
    containing RUSH_JOBID, the job submitted and is running.
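
    That rule boils down to three outcomes, which can be sketched as a
    small function that works on already-captured output. The helper name
    submit_status() and the sample lines (including the jobid) are made up
    for illustration; real 'rush -submit' output may look different.

```perl
#!/usr/bin/perl -w
use strict;

# Classify captured 'rush -submit' output (stdout+stderr combined).
# Returns ( status, jobid, messages ) where status is one of:
#     "ok"       - jobid found, no error/warning lines
#     "warnings" - jobid found, but warnings were also printed
#     "failed"   - no jobid found
# NOTE: submit_status() is a hypothetical helper, not part of rush.
sub submit_status
{
    my (@lines) = @_;
    my $jobid = "";
    my $msgs  = "";
    foreach ( @lines )
    {
        if    ( /^\*\*\*/ || /^rush:/ ) { $msgs .= $_; }
        elsif ( /RUSH_JOBID.(\S+)/ )    { $jobid = $1; }
    }
    my $status = ( $jobid eq "" ) ? "failed"
               : ( $msgs  ne "" ) ? "warnings"
               :                    "ok";
    return ($status, $jobid, $msgs);
}

# Made-up sample output, for illustration only
my @sample = ( "*** some warning text\n", "RUSH_JOBID=example.123\n" );
my ($status, $jobid, $msgs) = submit_status(@sample);
print "status=$status jobid=$jobid\n";
```

    With the sample above this prints "status=warnings jobid=example.123",
    since a jobid was found but a '***' line was printed too.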

    Here's one way to approach submitting a job and checking for warnings
    and errors correctly. This should work on any platform (windows, unix..):

------- snip
#!/usr/bin/perl -w
use strict;

# CREATE A SUBMIT FILE
#     Just some 'do nothing' commands for this example..
#
my $tmpfile = ".submit-$$.txt";
unless(open(TMPFILE, ">$tmpfile")) { print STDERR "$tmpfile: $!\n"; exit(1); }
print TMPFILE << "EOF";
frames   1-10
command  sleep 30
cpus     +any=10
EOF
close(TMPFILE);

# SUBMIT JOB, CHECKING FOR ERRORS
my $jobid = "";     # will contain jobid of submitted job (if submit successful)
my $errmsg = "";    # will contain errors or warning messages, if any
unless ( open(SUBMIT, "rush -submit < $tmpfile 2>&1 |") )
    { print STDERR "open(rush -submit..) failed: $!\n"; exit(1); }
while (<SUBMIT>)
{
    # HANDLE ERRORS / WARNINGS
    if ( /^\*\*\*/ || /^rush:/ )        # check for error messages
        { $errmsg .= $_; }              # save errors for later
    else
    {
        # HANDLE ALL OTHER OUTPUT
        if ( /RUSH_JOBID.(\S+)/ )       # job submitted OK, this is the jobid
             { $jobid = $1; }
    }
}
close(SUBMIT);

# REMOVE TMP FILE
#    We no longer need it..
#
unless (unlink($tmpfile))
    { print STDERR "unlink($tmpfile): $!\n"; exit(1); }

# ANY ERROR MESSAGES?
if ( $errmsg ne "" )
{
    if ( $jobid eq "" )
        # NO JOBID INDICATES FAILED TO SUBMIT
        { print "FAILED TO SUBMIT:\n$errmsg\n"; }
    else
        # JOB SUBMITTED, BUT THERE WERE WARNINGS
        { print "WARNINGS:\n$errmsg\n"; }
}

# JOB NOT STARTED IF WE DIDN'T GET THE JOBID
if ( $jobid eq "" )
    { print "JOB DID NOT SUBMIT\n"; exit(1); }

print "JOBID IS $jobid\n";
exit(0);

------- snip