From: Greg Ercolano <erco@(email surpressed)>
Subject: [Q+A] What's the best way to parse output from the rush command line?
Date: Wed, 09 Aug 2006 15:38:20 -0400
Msg# 1371

> We're writing scripts that invoke rush commands to query information
> from our job server. It's clear how to parse the reports, but one concern
> is handling error messages properly. What's the 'correct' way to handle
> error messages from rush commands?

True, the details of parsing e.g. error messages are not openly documented.

The examples on the HOW-TO page and in the rush example submit scripts
do show recommended techniques, but they don't come out and say what
the rules are. I'll try to add a section on that in the 'How-To' docs.
Meanwhile, I'll outline some info here.

There is a 'standard' that rush follows internally for reporting error
messages. For instance, IRUSH and WWW-RUSH make use of this, so that
error messages don't get split into the columns of the report fields,
which would mess up sorting. The rush example submit scripts also make
use of this to report error messages properly.

The edict is:

--------------------------------------------------------------------

    o All error messages reported by rush(1) lead off with either
      "rush: [text..]" or "*** [text..]". Examples:

          rush: 'rush -lj': tahoe: No route to host

          *** NO RESPONSE FROM:
          *** alturas burke caddo crystal eagle
          *** elk elwell gull rico sardis shasta

    o All rush commands report proper exit codes; '0' means it worked,
      and non-zero means it failed.

    o Most error messages are printed to stderr, though some warnings
      are printed to stdout.

    o All regular report output should be to stdout.

--------------------------------------------------------------------

So when parsing rush(1) output from e.g. perl, the way to detect
an error message is:

    if ( /^\*\*\*/ || /^rush:/ ) { ..an error message.. }

When parsing rush output, if you want to be sure to get all error
messages, combine stdout and stderr. In perl and bash, you would
use "2>&1" to do this. (see below)

PARSING REPORTS
---------------
All the rush report commands are similar in how they should be parsed.
The general technique would be:

    open(RUSH, "rush -lj 2>&1 |");
    while (<RUSH>)
    {
        if ( /^\*\*\*/ || /^rush:/ ) { ..error message.. }
        else                         { ..parse report.. }
    }
    close(RUSH);

An actual example showing how to parse the 'rush -laj' report:

------- snip
#!/usr/bin/perl -w
use strict;

# OPEN RUSH JOBS REPORT
#    > Set a shorter TCP timeout (default is 45 seconds)
#    > Be sure to get both stdout and stderr (2>&1)
#
my $errmsg = "";        # errors, if any
my @runjobs;            # jobs that are 'running'
my @donejobs;           # jobs that are 'done'
unless ( open(RUSH, "rush -laj -s 8 2>&1 |") )
    { print STDERR "open(rush -laj..) failed: $!\n"; exit(1); }
while (<RUSH>)
{
    # HANDLE ERROR MESSAGES
    if ( /^\*\*\*/ || /^rush:/ )        # check for error messages
        { $errmsg .= $_; }              # save errors for later
    else
    {
        # HANDLE REPORT OUTPUT
        if ( /^Run/  ) { push(@runjobs,  $_); next; }
        if ( /^Done/ ) { push(@donejobs, $_); next; }
    }
}
close(RUSH);

# PRINT RESULTS
print "RUN JOBS:\n";  foreach ( @runjobs )  { print "$_"; } print "\n";
print "DONE JOBS:\n"; foreach ( @donejobs ) { print "$_"; } print "\n";

# PRINT ERRORS (IF ANY)
if ( $errmsg ne "" )
    { print "ERRORS:\n$errmsg\n"; }
------- snip

This general parsing approach should work for all rush reports.

SUBMITTING JOBS
---------------
When submitting jobs, the best way to detect a successful submission
is to look for a line that contains RUSH_JOBID followed by a character,
followed by a jobid. In perl regex, that would be /RUSH_JOBID.(\S+)/
where the '.' matches any character, and the (\S+) is the jobid.

When submitting a job, "warning messages" are sometimes printed
even though the job submits anyway. In other cases, an error is
printed and the job won't be started. To differentiate, the simple
rule is that if 'rush -submit' printed a line that contains
RUSH_JOBID, the job submitted and is running.

Here's one way to approach submitting a job, and checking for
warnings and errors correctly.
This should work on any platform (Windows, Unix..):

------- snip
#!/usr/bin/perl -w
use strict;

# CREATE A SUBMIT FILE
#    Just some 'do nothing' commands for this example..
#
my $tmpfile = ".submit-$$.txt";
unless ( open(TMPFILE, ">$tmpfile") )
    { print STDERR "$tmpfile: $!\n"; exit(1); }
print TMPFILE <<"EOF";
frames    1-10
command   sleep 30
cpus      +any=10
EOF
close(TMPFILE);

# SUBMIT JOB, CHECKING FOR ERRORS
my $jobid  = "";        # will contain jobid of submitted job (if submit successful)
my $errmsg = "";        # will contain errors or warning messages, if any
unless ( open(SUBMIT, "rush -submit < $tmpfile 2>&1 |") )
    { print STDERR "open(rush -submit..) failed: $!\n"; exit(1); }
while (<SUBMIT>)
{
    # HANDLE ERRORS / WARNINGS
    if ( /^\*\*\*/ || /^rush:/ )        # check for error messages
        { $errmsg .= $_; }              # save errors for later
    else
    {
        # HANDLE ALL OTHER OUTPUT
        if ( /RUSH_JOBID.(\S+)/ )       # job submitted OK, this is the jobid
            { $jobid = $1; }
    }
}
close(SUBMIT);

# REMOVE TMP FILE
#    We no longer need it..
#
unless ( unlink($tmpfile) )
    { print STDERR "unlink($tmpfile): $!\n"; exit(1); }

# ANY ERROR MESSAGES?
if ( $errmsg ne "" )
{
    if ( $jobid eq "" )         # NO JOBID INDICATES FAILED TO SUBMIT
        { print "FAILED TO SUBMIT:\n$errmsg\n"; }
    else                        # JOB SUBMITTED, BUT THERE WERE WARNINGS
        { print "WARNINGS:\n$errmsg\n"; }
}

# JOB NOT STARTED IF WE DIDN'T GET THE JOBID
if ( $jobid eq "" )
    { print "JOB DID NOT SUBMIT\n"; exit(1); }

print "JOBID IS $jobid\n";
exit(0);
------- snip
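
CHECKING EXIT CODES
-------------------
The scripts above rely only on the "rush:" / "***" prefixes and on
RUSH_JOBID. Since all rush commands also report proper exit codes
('0' worked, non-zero failed), a script can check those too. What
follows is only a minimal sketch, not something from the rush
distribution: run_and_classify() is a hypothetical helper name, and
it assumes a shell that understands "2>&1".

```perl
#!/usr/bin/perl -w
use strict;

# run_and_classify() is a hypothetical helper (an assumption for this sketch).
# It runs $cmd with stderr merged into stdout and returns:
#     (exit code, ref to report lines, ref to error lines)
# The exit code is -1 if the command could not be run or died from a signal.
sub run_and_classify
{
    my ($cmd) = @_;
    my (@report, @errors);
    open(my $fh, "$cmd 2>&1 |")
        or return (-1, [], ["open($cmd) failed: $!\n"]);
    while (my $line = <$fh>)
    {
        # Same prefix rule as the scripts above
        if ( $line =~ /^\*\*\*/ || $line =~ /^rush:/ ) { push(@errors, $line); }
        else                                           { push(@report, $line); }
    }
    close($fh);                         # on a pipe, close() waits and sets $?
    my $status = $?;
    my $exit = ($status & 127) ? -1             # killed by a signal
                               : ($status >> 8);    # normal exit code
    return ($exit, \@report, \@errors);
}
```

For example, run_and_classify("rush -lj") would hand back the exit
code of 'rush -lj' along with its report lines and any error lines,
so a non-zero exit code can be treated as a failure even when no
line matched the error prefixes.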