From: "Abraham Schneider" <aschneider@(email surpressed)>
Subject: several strange Rush behaviours
   Date: Thu, 19 Jul 2012 04:51:07 -0400
Msg# 2257
Hi there!

I wanted to ask about two strange behaviours that occur from time to time on our Rush render farm and for which I don't have any plausible explanation. Our farm is a mix of Macs and Linux machines with Rush 102.42a9c/d installed, rendering Nuke 6.3v8.

1. problem:
Very irregularly and at random, we get a situation like this:

Done     rind10.12202     N_NORM_019_060_comp_v04_as aschneid        %100    %0    0   16:52:52
Done     rind10.12204     N_NORM_002_010_comp_v16_dl dlaubsch        %100    %0    0   16:32:02
Done     rind10.12206     N_LOW_048_060_redLog_v00a_mw mwarlimont      %100    %0    0   16:28:48
Done     rind10.12213     N_NORM_001_050_comp_v38_ts tstern          %100    %0    0   16:06:51
Done     rind10.12215     N_LOW_048_080_redLog_v00a_mw mwarlimont      %100    %0    0   16:05:03
Done     rind10.12218     N_NORM_019_060_comp_v04_as aschneid        %100    %0    0   15:08:02
Fail     rind10.12221     N_NORM_001_010_comp_v01_st stischne         %99    %1    0   00:58:31
Done     rind10.12223     N_NORM_001_010_comp_v01_st stischne        %100    %0    0   00:55:04
Run      rind10.12225     N_NORM_001_050_comp_v38_ts tstern           %81    %0    2   00:41:35
Run      rind10.12226     N_NORM_055_010_comp_v01_mt mwarlimont        %0    %0    0   00:36:58
Run      rind10.12227     N_NORM_100_020_comp_v21_mt mwarlimont        %0    %0    0   00:34:11
Done     rind10.12228     N_NORM_001_010_comp_v01_st stischne        %100    %0    0   00:23:08
Run      rind10.12230     N_NORM_103_010_comp_v14_mt mwarlimont       %36    %0   17   00:19:49
Run      rind10.12231     N_NORM_022_cfd0046_comp_v102 ppoetsch          %6    %0    0   00:19:37

Rush is configured to work "first in - first out". And all these jobs were submitted from inside of Nuke via a slightly modified submit_nuke.pl script, all with the same priorities '+nuke=42@500', no difference in submitting at all, as far as I can see. Nothing changed on the farm, no machines added or removed, switched on/offline, etc.

Most of the time everything works fine and the jobs are rendered one after the other in order of submission time/job ID. But sometimes something like the above happens: job 12225 starts rendering on all online machines. Halfway through the render, it just stops or the number of CPUs drops significantly, and all the other machines continue rendering a much newer job (in this case job 12228), skipping the unfinished frames of job 12225 as well as the next submitted jobs, 12226 and 12227. 12228 was rendered completely, and instead of returning to 12225/12226/12227, most of the machines (except for one machine with two CPUs, which kept rendering 12225) continued with 12230. I tried pausing 12230 while most of the machines were rendering it. The result was that the machines continued with 12231.

Is there any reason why Rush randomly deviates from 'first in/first out' from time to time, and is there a solution? The problem is hard to debug because I haven't found a way to reproduce this behaviour.



2. problem:
Most of the time, when a machine/workstation is switched from offline to online, it takes anywhere from many seconds to several minutes for this machine to pick up a frame and start rendering. The machine is shown as 'online' instantly, but it just won't start rendering a frame. It's listed as 'online' and 'idle' for several minutes. This happens on all of our machines, no matter whether they are Macs or Linux.

Any explanation for that?


Thanks, Abraham


Abraham Schneider
Senior VFX Compositor
 

ARRI Film & TV Services GmbH
Tuerkenstr. 89
D-80799 Muenchen / Germany

Phone (Tel# suppressed) 

EMail aschneider@(email surpressed)
www.arri.de/filmtv
________________________________


ARRI Film & TV Services GmbH
Registered office: München, Register court: Amtsgericht München
Commercial register number: HRB 69396
Managing directors: Franz Kraus, Dr. Martin Prillmann, Josef Reidinger

   From: Greg Ercolano <erco@(email surpressed)>
Subject: Re: several strange Rush behaviours
   Date: Thu, 19 Jul 2012 12:58:08 -0400
Msg# 2258
On 07/19/12 01:51, Abraham Schneider wrote:
> Done     rind10.12202     N_NORM_019_060_comp_v04_as aschneid        %100    %0    0   16:52:52
> Done     rind10.12204     N_NORM_002_010_comp_v16_dl dlaubsch        %100    %0    0   16:32:02
> Done     rind10.12206     N_LOW_048_060_redLog_v00a_mw mwarlimont    %100    %0    0   16:28:48
> Done     rind10.12213     N_NORM_001_050_comp_v38_ts tstern          %100    %0    0   16:06:51
> Done     rind10.12215     N_LOW_048_080_redLog_v00a_mw mwarlimont    %100    %0    0   16:05:03
> Done     rind10.12218     N_NORM_019_060_comp_v04_as aschneid        %100    %0    0   15:08:02
> Fail     rind10.12221     N_NORM_001_010_comp_v01_st stischne         %99    %1    0   00:58:31
> Done     rind10.12223     N_NORM_001_010_comp_v01_st stischne        %100    %0    0   00:55:04
> Run      rind10.12225     N_NORM_001_050_comp_v38_ts tstern           %81    %0    2   00:41:35
> Run      rind10.12226     N_NORM_055_010_comp_v01_mt mwarlimont        %0    %0    0   00:36:58
> Run      rind10.12227     N_NORM_100_020_comp_v21_mt mwarlimont        %0    %0    0   00:34:11
> Done     rind10.12228     N_NORM_001_010_comp_v01_st stischne        %100    %0    0   00:23:08
> Run      rind10.12230     N_NORM_103_010_comp_v14_mt mwarlimont       %36    %0   17   00:19:49
> Run      rind10.12231     N_NORM_022_cfd0046_comp_v102 ppoetsch        %6    %0    0   00:19:37
> 
> ..sometimes something like above happens: job 12225 starts rendering on all online
> machines. But halfway through the rendering, it just stops or the amount
> of CPUs drops significantly and all the other machines continue
> rendering on a much newer job

Hi Abraham,

        Wow, you have large jobids! You must have the jobidmax value
	cranked up. Be careful with that (see below).

	Can I see these reports for the rind10.12225/6/7 jobs? eg:

		rush -lf rind10.12225 rind10.12226 rind10.12227
		rush -lc rind10.12225 rind10.12226 rind10.12227

	The '26 and '27 jobs appear to be getting completely skipped over,
	they stand out to me.

	All the rest seem like they could be OK, need to see the reports
	to know more.

	The 12225 job doesn't worry me too much, as it's 81% done with
	2 busy frames, so if those are the last two frames in the job,
	that would make sense. But if there are still available frames
	in the Que state with a TRY count of zero, that would be puzzling.

	As you probably know, if a job is rendering its last few frames,
	newly idle cpus will go to the next jobs down. If someone requeues
	all the frames in one of the higher up jobs, then that could bring
	them back down to 0% done, and they'd have to wait for available procs.
	I'll be able to tell from the 'Frames' report; the TRY column will show
	if a frame has already been run before.
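
To make the fall-through behaviour above concrete, here is a minimal Python sketch of a FIFO dispatcher. It is a toy model under stated assumptions, not Rush's actual scheduler: the `dispatch` function, the dict of queued-frame counts, and the frame numbers are all hypothetical. It only illustrates why idle cpus move on to later jobs once an earlier job has no frames left waiting in the queue.

```python
def dispatch(queued_frames, idle_cpus):
    """Hand idle cpus to jobs in ascending job-ID order (toy FIFO model).

    queued_frames: dict mapping job ID -> frames still waiting (not running).
    Returns a dict mapping job ID -> number of cpus assigned to that job.
    """
    assigned = {}
    for jobid in sorted(queued_frames):
        if idle_cpus == 0:
            break
        # A job can only absorb as many cpus as it has waiting frames.
        take = min(queued_frames[jobid], idle_cpus)
        if take:
            assigned[jobid] = take
            idle_cpus -= take
    return assigned


# Hypothetical numbers: job 12225 has frames running but none waiting,
# so every newly idle cpu falls through to the next job in line.
queue = {12225: 0, 12226: 30, 12227: 40}
print(dispatch(queue, idle_cpus=20))
```

In this model a job that is only finishing its last running frames never blocks the queue; a job with waiting frames that still gets no cpus would be the puzzling case.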

> 2. problem:
> Most of the time, switching a machine/workstation from offline to
> online, it takes from many seconds to several minutes for this machine
> to pick up a frame and start rendering. The machine is shown as 'online'
> instantly, but it just won't start rendering a frame. It's listed as
> 'online' and 'idle' for several minutes. This happens for all of our
> machines, doesn't matter if they are Macs or Linux.

	Can you send me the tasklist for the machine in question, ie:

		rush -tasklist SLOWHOST

	..I want to see how large that report is. If it's really large,
	that might be the reason.

	That report will show the list of jobs it is considering
	to give the idle cpus in the order it wants to check.

	One situation might be if there's a bunch of jobs at the
	top of its list that are being managed by a machine that
	is currently down. In that case rush will try to contact
	that machine to get the job started, and will keep trying
	until a timeout of about a minute or so, then it will give up
	and move to the next jobs in the list that are not on that
	unresponsive machine.

	Another possibility is if machines reboot to new IP addresses
	(eg. DHCP assigned machines), that might cause rushd to not
	be able to reach job servers to establish jobs, causing the
	above situation.
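
As a rough illustration of how an unreachable job server could add up to the delay described above, here is a small Python sketch. It is only a model of the description in this message, not rushd's real logic: the `first_reachable` function, the host name `oldsrv`, and the flat 60 second timeout are assumptions made for the example. The `host.jobid` form of the job IDs follows the reports in this thread.

```python
def first_reachable(tasklist, down_hosts, timeout_secs=60):
    """Walk a task list in order and return (job, seconds_waited).

    tasklist: job IDs in the order they would be checked, e.g. "rind10.12225".
    down_hosts: set of job server names that do not respond.
    Each unresponsive server costs one timeout; after that, the remaining
    jobs on the same server are skipped without waiting again.
    """
    waited = 0
    timed_out = set()
    for job in tasklist:
        server = job.split(".")[0]
        if server in timed_out:
            continue  # already gave up on this server
        if server in down_hosts:
            waited += timeout_secs
            timed_out.add(server)
            continue
        return job, waited
    return None, waited


# Hypothetical task list: two jobs on a dead server sit above a live one.
tasks = ["oldsrv.101", "oldsrv.102", "rind10.12225"]
print(first_reachable(tasks, down_hosts={"oldsrv"}))
```

With one dead server at the top of the list, the idle machine in this model waits about a minute before it picks up its first frame; several dead servers would stack up to several minutes.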

	It might be good if you send me the rushd.log from machines
	that act this way; I might be able to tell from that if
	there's a problem.

> Any explanation for that?

    Those large jobids might be the culprit, not sure.

    When you have jobidmax set high, this can mean thousands of jobs can remain
    in the queue, causing the system to work extra hard to find jobs that are
    available.

    The large max should be OK /as long/ as the 'Jobs' reports are kept trim.
    ie. dump old jobs. You don't want to leave old jobs in the queue; they take
    up memory and make the daemon work harder internally to consider those
    jobs in case they've been requeued.

    And if you have several job servers each with very large queues, that
    would exacerbate the problem.

    The reason rush comes with 999 as the max for jobids is to force
    folks to dump old jobs so that the queue doesn't get artificially large.

-- 
Greg Ercolano, erco@(email surpressed)
Seriss Corporation
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)ext.23
Fax: (Tel# suppressed)
Cel: (Tel# suppressed)


   From: "Abraham Schneider" <aschneider@(email surpressed)>
Subject: Re: several strange Rush behaviours
   Date: Fri, 20 Jul 2012 06:31:23 -0400
Msg# 2259
On 19.07.2012 at 18:58, Greg Ercolano wrote:

> [posted to rush.general]
>
> On 07/19/12 01:51, Abraham Schneider wrote:
>> Done     rind10.12202     N_NORM_019_060_comp_v04_as aschneid        %100    %0    0   16:52:52
>> Done     rind10.12204     N_NORM_002_010_comp_v16_dl dlaubsch        %100    %0    0   16:32:02
>> Done     rind10.12206     N_LOW_048_060_redLog_v00a_mw mwarlimont    %100    %0    0   16:28:48
>> Done     rind10.12213     N_NORM_001_050_comp_v38_ts tstern          %100    %0    0   16:06:51
>> Done     rind10.12215     N_LOW_048_080_redLog_v00a_mw mwarlimont    %100    %0    0   16:05:03
>> Done     rind10.12218     N_NORM_019_060_comp_v04_as aschneid        %100    %0    0   15:08:02
>> Fail     rind10.12221     N_NORM_001_010_comp_v01_st stischne         %99    %1    0   00:58:31
>> Done     rind10.12223     N_NORM_001_010_comp_v01_st stischne        %100    %0    0   00:55:04
>> Run      rind10.12225     N_NORM_001_050_comp_v38_ts tstern           %81    %0    2   00:41:35
>> Run      rind10.12226     N_NORM_055_010_comp_v01_mt mwarlimont        %0    %0    0   00:36:58
>> Run      rind10.12227     N_NORM_100_020_comp_v21_mt mwarlimont        %0    %0    0   00:34:11
>> Done     rind10.12228     N_NORM_001_010_comp_v01_st stischne        %100    %0    0   00:23:08
>> Run      rind10.12230     N_NORM_103_010_comp_v14_mt mwarlimont       %36    %0   17   00:19:49
>> Run      rind10.12231     N_NORM_022_cfd0046_comp_v102 ppoetsch        %6    %0    0   00:19:37
>>
>> ..sometimes something like above happens: job 12225 starts rendering on all online
>> machines. But halfway through the rendering, it just stops or the amount
>> of CPUs drops significantly and all the other machines continue
>> rendering on a much newer job
>
> Hi Abraham,
>
>        Wow, you have large jobids! You must have the jobidmax value
>       cranked up. Be careful with that (see below).

Don't worry, it's just the number that I cranked up. Normally there aren't more than 200-300 jobs in Rush; we dump them regularly. I use these high numbers because, when there are jobs on the farm and the maximum job ID is reached, new jobs start at 0 again and are rendered next, ahead of older jobs still waiting in the queue. FIFO is based on the job ID, I think!? So I wanted to avoid this wrap from the max job ID back to 0 happening too often during the day.
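
The wrap-around effect described here can be sketched in a few lines of Python. This is a toy model built on the assumptions stated in this message (IDs restart at 0 once the maximum is reached, and FIFO order follows the job ID); the `allocate_ids` helper and the choice of 999 as an inclusive maximum are hypothetical, not taken from Rush itself.

```python
def allocate_ids(count, start, jobidmax):
    """Assign `count` job IDs in submission order, wrapping to 0 after jobidmax."""
    ids = []
    current = start
    for _ in range(count):
        ids.append(current)
        current = 0 if current >= jobidmax else current + 1
    return ids


# Five jobs submitted one after another, close to the maximum ID.
submitted = allocate_ids(5, start=997, jobidmax=999)
print("submission order:", submitted)

# If jobs are served in ascending ID order, the two newest jobs
# (IDs 0 and 1) jump ahead of the three older ones.
print("order by job ID: ", sorted(submitted))
```

A larger maximum does not remove the inversion in this model; it only makes the wrap happen less often.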

>
>       Can I see these reports for the rind10.12225/6/7 jobs? eg:
>
>               rush -lf rind10.12225 rind10.12226 rind10.12227
>               rush -lc rind10.12225 rind10.12226 rind10.12227

see separate mail to greg@(email surpressed)
>
>       The '26 and '27 jobs appear to be getting completely skipped over,
>       they stand out to me.
>
>       All the rest seem like they could be OK, need to see the reports
>       to know more.
>
>       The 12225 job doesn't worry me too much, as it's 81% done with
>       2 busy frames, so if those are the last two frames in the job,
>       that would make sense. But if there are still available frames
>       in the Que state with a TRY count of zero, that would be puzzling.
>
>       As you probably know, if a job is rendering its last few frames,
>       newly idle cpus will go to the next jobs down. If someone requeues
>       all the frames in one of the higher up jobs, then that could bring
>       them back down to 0% done, and they'd have to wait for available procs.
>       I'll be able to tell from the 'Frames' report; the TRY column will show
>       if a frame has already been run before.

Of course I know that. As you can see in the report, 81% done was not at the end of the shot, as the shot is quite long. You see the problem part where only the machine 'sunrender' is rendering frames.

>
>> 2. problem:
>> Most of the time, switching a machine/workstation from offline to
>> online, it takes from many seconds to several minutes for this machine
>> to pick up a frame and start rendering. The machine is shown as 'online'
>> instantly, but it just won't start rendering a frame. It's listed as
>> 'online' and 'idle' for several minutes. This happens for all of our
>> machines, doesn't matter if they are Macs or Linux.
>
>       Can you send me the tasklist for the machine in question, ie:
>
>               rush -tasklist SLOWHOST

see separate mail
>
>       ..I want to see how large that report is. If it's really large,
>       that might be the reason.
>
>       That report will show the list of jobs it is considering
>       to give the idle cpus in the order it wants to check.
>
>       One situation might be if there's a bunch of jobs at the
>       top of its list that are being managed by a machine that
>       is currently down. In that case rush will try to contact
>       that machine to get the job started, and will keep trying
>       until a timeout of about a minute or so, then it will give up
>       and move to the next jobs in the list that are not on that
>       unresponsive machine.
>
>       Another possibility is if machines reboot to new IP addresses
>       (eg. DHCP assigned machines), that might cause rushd to not
>       be able to reach job servers to establish jobs, causing the
>       above situation.
>
>       It might be good if you send me the rushd.log from machines
>       that act this way; I might be able to tell from that if
>       there's a problem.
>
There doesn't seem to be a problem there; the rushd.log of the problem machine 'apu' is quite small:

today:
07/20,03:00:27 ROTATE     rushd.log rotated. pid=146, 0/3 busy, OFFLINE
07/20,03:00:27 ROTATE     apu RUSHD 102.42a9d PID=146     Boot=07/17/12,19:07:25

yesterday:
07/19,03:00:16 ROTATE     rushd.log rotated. pid=146, 0/3 busy, OFFLINE
07/19,03:00:16 ROTATE     apu RUSHD 102.42a9d PID=146     Boot=07/17/12,19:07:25
07/19,18:23:56 SECURITY   Daemon changed to Online by aschneid@itchy[192.168.10.21], Remark:online by aschneid 07/19/12,18:23 via irush state (online) by aschneid@itchy[192.168.10.21]
07/19,19:02:03 SECURITY   Daemon changed to Getoff by aschneid@itchy[192.168.10.21], Remark:getoff by aschneid 07/19/12,19:02 via irush state (getoff) by aschneid@itchy[192.168.10.21]
07/20,03:00:27 ROTATE     rushd.log rotated by rush.conf: logrotatehour=3


>> Any explanation for that?
>
>    Those large jobids might be the culprit, not sure.
>
>    When you have jobidmax set high, this can mean thousands of jobs can remain
>    in the queue, causing the system to work extra hard to find jobs that are
>    available.
>
>    The large max should be OK /as long/ as the 'Jobs' reports are kept trim.
>    ie. dump old jobs. You don't want to leave old jobs in the queue; they take
>    up memory and make the daemon work harder internally to consider those
>    jobs in case they've been requeued.
>
>    And if you have several job servers each with very large queues, that
>    would exacerbate the problem.
>
>    The reason rush comes with 999 as the max for jobids is to force
>    folks to dump old jobs so that the queue doesn't get artificially large.

As I said above, we try to dump as often as possible. We use only one submit host and I get:

rush -lj rind10 | wc
     266    2130   25912

So there are about 260 jobs in our queue


Hope that helps.

Abraham




   From: Greg Ercolano <erco@(email surpressed)>
Subject: Re: several strange Rush behaviours
   Date: Fri, 20 Jul 2012 12:54:45 -0400
Msg# 2260
On 07/20/12 03:03, Abraham Schneider wrote:
> As I can't send attachments to the mailing list (right?), here are the two
> reports that you requested [via private email]

	Right, I wouldn't normally advise attachments on the mailing list,
	though you can paste clear text which is what I've done below with
	your reports, and following up to the group here.

	For the reports to appear correctly, I always have wordwrap turned off
	in my mail program so that it doesn't break the formatting during send.

> here are the two reports that you requested in the mailing list conversation
> 'several strange Rush behaviours'.

	Great.

	So looking at just the Frames reports for those three jobs you sent me,
	they look all OK, actually.

	So there doesn't seem to be a problem in those after all..
	Sorry, guess I should have asked for the other job reports as well.

	Can you send me (via private email so I can reformat the reports
	for posting) the same -lf/-lc reports for the other 12228,30,31 jobs?
	(if they were dumped, you can instead send the framelist/jobinfo
	files from those job's logdir)

	Anyway, looking at the three jobs you sent..
	
* * *

	Focusing on just the 'JOBID' and 'START' columns, they appear to
	all be OK:

	JOBID 12225's frame START times go incrementally from 9:39 ~ 10:41.
	JOBID 12226's frame START times go incrementally from 10:41 ~ 10:43.
	JOBID 12227's frame START times go incrementally from 10:43 ~ 10:47.

	This is what I'd expect to see: the smaller jobids run first,
	the newer jobs run after, with a small bit of dovetailing at the end.
	So the FIFO looks normal here.

	The first job has render times that start around a minute, but grow
	towards the end to 10 or 20 mins (eg. frames 117 and 121 being >20m).

	The render times of the last two jobs are much shorter than the first,
	rarely going above a minute, and never more than 2 mins,
	so the first job will still be rendering those long frames
	way after the newer jobs have started and finished.

	All the TRY counts are 1, showing frames weren't retrying or requeued,
	they all were straight runs, which makes it a good example.
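
The same check (per-job START range and highest TRY count) can be done mechanically. Below is a minimal Python sketch that assumes the whitespace-separated column layout of the Frames report shown in this message (STAT FRAME TRY HOSTNAME PID JOBID START ELAPSED). The `summarize` function is a hypothetical helper, not part of Rush, and the three sample rows are copied from the report for job 12225. START values are compared as strings, which is only valid while all rows share the fixed `MM/DD,HH:MM:SS` format within one year.

```python
def summarize(report_text):
    """Return {jobid: {"first": start, "last": start, "max_try": n}}."""
    summary = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything too short to parse.
        if len(fields) < 8 or fields[0] == "STAT":
            continue
        stat, frame, tries, host, pid, jobid, start, elapsed = fields[:8]
        entry = summary.setdefault(
            jobid, {"first": start, "last": start, "max_try": 0}
        )
        entry["first"] = min(entry["first"], start)
        entry["last"] = max(entry["last"], start)
        entry["max_try"] = max(entry["max_try"], int(tries))
    return summary


sample = """\
STAT FRAME   TRY HOSTNAME        PID     JOBID            START          ELAPSED  NOTES
Done 0000    1   rind7           9320    rind10.12225     07/19,09:39:57 00:01:00
Done 0117    1   burnl13         27549   rind10.12225     07/19,09:46:56 00:21:02
Done 0171    1   burnl16         19868   rind10.12225     07/19,09:57:01 00:01:57
"""

for jobid, info in summarize(sample).items():
    print(jobid, info["first"], "->", info["last"], "max TRY:", info["max_try"])
```

Run over the full reports for several jobs, non-overlapping START ranges in job-ID order and a max TRY of 1 would match the normal FIFO picture described above.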

	Here's the report you sent me showing the three jobs, for the sake
	of the newsgroup:

STAT FRAME   TRY HOSTNAME        PID     JOBID            START          ELAPSED  NOTES
Done 0000    1   rind7           9320    rind10.12225     07/19,09:39:57 00:01:00
Done 0001    1   rind8           22894   rind10.12225     07/19,09:39:57 00:00:59
Done 0002    1   rind9           17769   rind10.12225     07/19,09:39:57 00:01:00
Done 0003    1   rind35          23440   rind10.12225     07/19,09:39:57 00:01:15
Done 0004    1   rind36          25612   rind10.12225     07/19,09:39:57 00:01:15
Done 0005    1   rind37          24266   rind10.12225     07/19,09:39:57 00:01:15
Done 0006    1   rind38          18216   rind10.12225     07/19,09:39:57 00:01:14
Done 0007    1   sunrender       3418    rind10.12225     07/19,09:39:57 00:00:47
Done 0008    1   burnl11         25895   rind10.12225     07/19,09:39:57 00:01:00
Done 0009    1   burnl12         18190   rind10.12225     07/19,09:39:57 00:01:02
Done 0010    1   burnl13         26707   rind10.12225     07/19,09:39:57 00:00:30
Done 0011    1   burnl16         18219   rind10.12225     07/19,09:39:57 00:00:59
Done 0012    1   burnl17         18177   rind10.12225     07/19,09:39:57 00:01:02
Done 0013    1   bumblebee       13909   rind10.12225     07/19,09:39:57 00:00:28
Done 0014    1   rind39          23650   rind10.12225     07/19,09:39:57 00:00:32
Done 0015    1   burnl15         19281   rind10.12225     07/19,09:39:57 00:00:58
Done 0016    1   sunrender       3424    rind10.12225     07/19,09:39:58 00:00:46
Done 0017    1   rind39          23655   rind10.12225     07/19,09:39:58 00:00:31
Done 0018    1   rind39          23657   rind10.12225     07/19,09:39:58 00:00:31
Done 0019    1   rind39          23659   rind10.12225     07/19,09:39:58 00:00:31
Done 0020    1   bumblebee       13922   rind10.12225     07/19,09:40:26 00:00:27
Done 0021    1   burnl13         26782   rind10.12225     07/19,09:40:28 00:00:29
Done 0022    1   rind39          23730   rind10.12225     07/19,09:40:30 00:00:31
Done 0023    1   rind39          23732   rind10.12225     07/19,09:40:30 00:00:31
Done 0024    1   rind39          23734   rind10.12225     07/19,09:40:30 00:00:31
Done 0025    1   rind39          23736   rind10.12225     07/19,09:40:30 00:00:29
Done 0026    1   sunrender       3634    rind10.12225     07/19,09:40:45 00:00:42
Done 0027    1   sunrender       3636    rind10.12225     07/19,09:40:45 00:00:42
Done 0028    1   bumblebee       13930   rind10.12225     07/19,09:40:55 00:00:27
Done 0029    1   burnl15         19404   rind10.12225     07/19,09:40:56 00:01:03
Done 0030    1   rind8           23137   rind10.12225     07/19,09:40:57 00:01:00
Done 0031    1   burnl16         18341   rind10.12225     07/19,09:40:57 00:01:00
Done 0032    1   rind7           9502    rind10.12225     07/19,09:40:59 00:01:01
Done 0033    1   rind9           17981   rind10.12225     07/19,09:40:59 00:01:00
Done 0034    1   burnl13         26850   rind10.12225     07/19,09:40:59 00:00:28
Done 0035    1   burnl11         26036   rind10.12225     07/19,09:40:59 00:01:01
Done 0036    1   burnl12         18312   rind10.12225     07/19,09:41:00 00:00:59
Done 0037    1   burnl17         18316   rind10.12225     07/19,09:41:00 00:00:58
Done 0038    1   rind39          23812   rind10.12225     07/19,09:41:01 00:00:29
Done 0039    1   rind39          23827   rind10.12225     07/19,09:41:02 00:00:28
Done 0040    1   rind39          23829   rind10.12225     07/19,09:41:02 00:00:28
Done 0041    1   rind39          23831   rind10.12225     07/19,09:41:02 00:00:28
Done 0042    1   rind38          18394   rind10.12225     07/19,09:41:12 00:01:11
Done 0043    1   rind36          25817   rind10.12225     07/19,09:41:13 00:01:10
Done 0044    1   rind35          23656   rind10.12225     07/19,09:41:13 00:01:10
Done 0045    1   rind37          24444   rind10.12225     07/19,09:41:13 00:01:11
Done 0046    1   bumblebee       13939   rind10.12225     07/19,09:41:23 00:00:24
Done 0047    1   sunrender       3824    rind10.12225     07/19,09:41:28 00:00:53
Done 0048    1   sunrender       3826    rind10.12225     07/19,09:41:28 00:00:53
Done 0049    1   burnl13         26920   rind10.12225     07/19,09:41:28 00:00:35
Done 0050    1   rind39          23892   rind10.12225     07/19,09:41:31 00:00:36
Done 0051    1   rind39          23894   rind10.12225     07/19,09:41:31 00:00:36
Done 0052    1   rind39          23896   rind10.12225     07/19,09:41:31 00:00:36
Done 0053    1   rind39          23898   rind10.12225     07/19,09:41:31 00:00:36
Done 0054    1   bumblebee       13946   rind10.12225     07/19,09:41:49 00:00:33
Done 0055    1   rind8           23401   rind10.12225     07/19,09:41:58 00:01:08
Done 0056    1   burnl17         18430   rind10.12225     07/19,09:41:59 00:01:11
Done 0057    1   burnl16         18475   rind10.12225     07/19,09:41:59 00:01:10
Done 0058    1   rind9           18069   rind10.12225     07/19,09:42:00 00:01:10
Done 0059    1   burnl12         18430   rind10.12225     07/19,09:42:00 00:01:10
Done 0060    1   burnl15         19522   rind10.12225     07/19,09:42:00 00:01:08
Done 0061    1   burnl11         26154   rind10.12225     07/19,09:42:00 00:01:54
Done 0062    1   rind7           9589    rind10.12225     07/19,09:42:02 00:02:06
Done 0063    1   burnl13         27004   rind10.12225     07/19,09:42:04 00:01:09
Done 0064    1   rind39          23972   rind10.12225     07/19,09:42:08 00:01:22
Done 0065    1   rind39          23974   rind10.12225     07/19,09:42:08 00:01:16
Done 0066    1   rind39          23976   rind10.12225     07/19,09:42:08 00:01:18
Done 0067    1   rind39          23978   rind10.12225     07/19,09:42:08 00:01:18
Done 0068    1   sunrender       3986    rind10.12225     07/19,09:42:22 00:01:44
Done 0069    1   sunrender       3989    rind10.12225     07/19,09:42:22 00:01:38
Done 0070    1   bumblebee       13955   rind10.12225     07/19,09:42:23 00:01:11
Done 0071    1   rind36          25916   rind10.12225     07/19,09:42:24 00:03:04
Done 0072    1   rind35          23755   rind10.12225     07/19,09:42:24 00:02:57
Done 0073    1   rind38          18491   rind10.12225     07/19,09:42:24 00:02:57
Done 0074    1   rind37          24543   rind10.12225     07/19,09:42:25 00:02:49
Done 0075    1   rind8           23620   rind10.12225     07/19,09:43:07 00:02:11
Done 0076    1   burnl15         19656   rind10.12225     07/19,09:43:10 00:02:21
Done 0077    1   burnl16         18594   rind10.12225     07/19,09:43:10 00:02:22
Done 0078    1   rind9           18167   rind10.12225     07/19,09:43:11 00:02:02
Done 0079    1   burnl17         18553   rind10.12225     07/19,09:43:11 00:02:02
Done 0080    1   burnl12         18564   rind10.12225     07/19,09:43:12 00:02:14
Done 0081    1   burnl13         27138   rind10.12225     07/19,09:43:15 00:01:07
Done 0082    1   rind39          24053   rind10.12225     07/19,09:43:26 00:01:12
Done 0083    1   rind39          24068   rind10.12225     07/19,09:43:28 00:01:10
Done 0084    1   rind39          24070   rind10.12225     07/19,09:43:28 00:01:08
Done 0085    1   rind39          24099   rind10.12225     07/19,09:43:31 00:01:10
Done 0086    1   bumblebee       13965   rind10.12225     07/19,09:43:36 00:01:09
Done 0087    1   burnl11         26352   rind10.12225     07/19,09:43:56 00:02:09
Done 0088    1   sunrender       4302    rind10.12225     07/19,09:44:01 00:01:39
Done 0089    1   sunrender       4408    rind10.12225     07/19,09:44:07 00:01:43
Done 0090    1   rind7           9755    rind10.12225     07/19,09:44:10 00:02:09
Done 0091    1   burnl13         27272   rind10.12225     07/19,09:44:23 00:01:06
Done 0092    1   rind39          24135   rind10.12225     07/19,09:44:38 00:01:37
Done 0093    1   rind39          24142   rind10.12225     07/19,09:44:39 00:01:33
Done 0094    1   rind39          24144   rind10.12225     07/19,09:44:39 00:01:33
Done 0095    1   rind39          24181   rind10.12225     07/19,09:44:42 00:01:33
Done 0096    1   bumblebee       13975   rind10.12225     07/19,09:44:47 00:01:15
Done 0097    1   rind9           18323   rind10.12225     07/19,09:45:15 00:04:20
Done 0098    1   rind37          24782   rind10.12225     07/19,09:45:15 00:06:20
Done 0099    1   burnl17         18784   rind10.12225     07/19,09:45:16 00:02:20
Done 0100    1   rind8           23788   rind10.12225     07/19,09:45:19 00:04:15
Done 0101    1   rind35          24008   rind10.12225     07/19,09:45:23 00:07:24
Done 0102    1   rind38          18704   rind10.12225     07/19,09:45:23 00:07:30
Done 0103    1   burnl12         18796   rind10.12225     07/19,09:45:27 00:02:30
Done 0104    1   rind36          26140   rind10.12225     07/19,09:45:29 00:07:03
Done 0105    1   burnl13         27393   rind10.12225     07/19,09:45:30 00:01:25
Done 0106    1   burnl15         19908   rind10.12225     07/19,09:45:32 00:02:59
Done 0107    1   burnl16         18858   rind10.12225     07/19,09:45:34 00:07:43
Done 0108    1   sunrender       4555    rind10.12225     07/19,09:45:42 00:02:15
Done 0109    1   sunrender       4675    rind10.12225     07/19,09:45:51 00:02:14
Done 0110    1   bumblebee       13985   rind10.12225     07/19,09:46:03 00:01:19
Done 0111    1   burnl11         26592   rind10.12225     07/19,09:46:07 00:09:21
Done 0112    1   rind39          24218   rind10.12225     07/19,09:46:13 00:02:02
Done 0113    1   rind39          24220   rind10.12225     07/19,09:46:13 00:02:02
Done 0114    1   rind39          24250   rind10.12225     07/19,09:46:17 00:02:01
Done 0115    1   rind39          24252   rind10.12225     07/19,09:46:17 00:02:04
Done 0116    1   rind7           9952    rind10.12225     07/19,09:46:20 00:04:23
Done 0117    1   burnl13         27549   rind10.12225     07/19,09:46:56 00:21:02
Done 0118    1   bumblebee       13995   rind10.12225     07/19,09:47:24 00:01:38
Done 0119    1   burnl17         19028   rind10.12225     07/19,09:47:37 00:19:09
Done 0120    1   sunrender       4811    rind10.12225     07/19,09:47:58 00:02:57
Done 0121    1   burnl12         19056   rind10.12225     07/19,09:47:59 00:24:16
Done 0122    1   sunrender       4935    rind10.12225     07/19,09:48:06 00:03:00
Done 0123    1   rind39          24303   rind10.12225     07/19,09:48:16 00:02:14
Done 0124    1   rind39          24305   rind10.12225     07/19,09:48:16 00:02:16
Done 0125    1   rind39          24335   rind10.12225     07/19,09:48:21 00:02:11
Done 0126    1   rind39          24350   rind10.12225     07/19,09:48:22 00:02:05
Done 0127    1   burnl15         20201   rind10.12225     07/19,09:48:33 00:16:05
Done 0128    1   bumblebee       14005   rind10.12225     07/19,09:49:04 00:01:30
Done 0129    1   rind8           24107   rind10.12225     07/19,09:49:36 00:04:26
Done 0130    1   rind9           18644   rind10.12225     07/19,09:49:36 00:04:29
Done 0131    1   rind39          24388   rind10.12225     07/19,09:50:28 00:01:54
Done 0132    1   rind39          24406   rind10.12225     07/19,09:50:32 00:01:52
Done 0133    1   rind39          24421   rind10.12225     07/19,09:50:33 00:01:51
Done 0134    1   rind39          24423   rind10.12225     07/19,09:50:33 00:01:49
Done 0135    1   bumblebee       14015   rind10.12225     07/19,09:50:36 00:01:26
Done 0136    1   rind7           10414   rind10.12225     07/19,09:50:44 00:03:13
Done 0137    1   sunrender       5169    rind10.12225     07/19,09:50:56 00:02:18
Done 0138    1   sunrender       5293    rind10.12225     07/19,09:51:07 00:02:28
Done 0139    1   rind37          25324   rind10.12225     07/19,09:51:37 00:08:10
Done 0140    1   bumblebee       14026   rind10.12225     07/19,09:52:03 00:01:27
Done 0141    1   rind39          24475   rind10.12225     07/19,09:52:24 00:01:58
Done 0142    1   rind39          24477   rind10.12225     07/19,09:52:24 00:01:58
Done 0143    1   rind39          24505   rind10.12225     07/19,09:52:25 00:01:57
Done 0144    1   rind39          24507   rind10.12225     07/19,09:52:25 00:01:55
Done 0145    1   rind36          26644   rind10.12225     07/19,09:52:33 00:07:09
Done 0146    1   rind35          24672   rind10.12225     07/19,09:52:48 00:07:28
Done 0147    1   rind38          19286   rind10.12225     07/19,09:52:56 00:06:44
Done 0148    1   sunrender       5423    rind10.12225     07/19,09:53:15 00:02:18
Done 0149    1   burnl16         19527   rind10.12225     07/19,09:53:18 00:03:41
Done 0150    1   bumblebee       14036   rind10.12225     07/19,09:53:32 00:01:16
Done 0151    1   sunrender       5547    rind10.12225     07/19,09:53:36 00:01:51
Done 0152    1   rind7           10815   rind10.12225     07/19,09:53:58 00:02:52
Done 0153    1   rind8           24443   rind10.12225     07/19,09:54:04 00:02:45
Done 0154    1   rind9           18997   rind10.12225     07/19,09:54:07 00:02:49
Done 0155    1   rind39          24557   rind10.12225     07/19,09:54:22 00:01:32
Done 0156    1   rind39          24571   rind10.12225     07/19,09:54:23 00:01:24
Done 0157    1   rind39          24573   rind10.12225     07/19,09:54:23 00:01:24
Done 0158    1   rind39          24575   rind10.12225     07/19,09:54:23 00:01:30
Done 0159    1   bumblebee       14046   rind10.12225     07/19,09:54:50 00:01:16
Done 0160    1   sunrender       5677    rind10.12225     07/19,09:55:28 00:01:57
Done 0161    1   burnl11         27380   rind10.12225     07/19,09:55:30 00:05:09
Done 0162    1   sunrender       5797    rind10.12225     07/19,09:55:34 00:01:57
Done 0163    1   rind39          24637   rind10.12225     07/19,09:55:50 00:01:26
Done 0164    1   rind39          24639   rind10.12225     07/19,09:55:50 00:01:27
Done 0165    1   rind39          24670   rind10.12225     07/19,09:55:55 00:01:28
Done 0166    1   rind39          24675   rind10.12225     07/19,09:55:55 00:01:22
Done 0167    1   bumblebee       14056   rind10.12225     07/19,09:56:06 00:01:12
Done 0168    1   rind8           24667   rind10.12225     07/19,09:56:51 00:02:51
Done 0169    1   rind7           11030   rind10.12225     07/19,09:56:52 00:02:47
Done 0170    1   rind9           19388   rind10.12225     07/19,09:56:57 00:02:49
Done 0171    1   burnl16         19868   rind10.12225     07/19,09:57:01 00:01:57
Done 0172    1   rind39          24723   rind10.12225     07/19,09:57:17 00:01:16
Done 0173    1   rind39          24738   rind10.12225     07/19,09:57:19 00:01:17
Done 0174    1   rind39          24740   rind10.12225     07/19,09:57:19 00:01:17
Done 0175    1   bumblebee       14066   rind10.12225     07/19,09:57:20 00:01:04
Done 0176    1   rind39          24781   rind10.12225     07/19,09:57:24 00:01:17
Done 0177    1   sunrender       5927    rind10.12225     07/19,09:57:26 00:01:44
Done 0178    1   sunrender       6009    rind10.12225     07/19,09:57:32 00:01:41
Done 0179    1   sunrender       6181    rind10.12225     07/19,09:59:11 00:01:43
Done 0180    1   sunrender       6188    rind10.12225     07/19,09:59:14 00:01:54
Done 0181    1   burnl16         20196   rind10.12225     07/19,09:59:52 00:02:29
Done 0182    1   rind8           25118   rind10.12225     07/19,09:59:54 00:02:38
Done 0183    1   rind39          25711   rind10.12225     07/19,09:59:54 00:01:27
Done 0184    1   rind39          25713   rind10.12225     07/19,09:59:54 00:01:25
Done 0185    1   rind39          25715   rind10.12225     07/19,09:59:54 00:01:23
Done 0186    1   rind39          25717   rind10.12225     07/19,09:59:54 00:01:23
Done 0187    1   bumblebee       14177   rind10.12225     07/19,09:59:54 00:01:09
Done 0188    1   rind9           19635   rind10.12225     07/19,09:59:57 00:02:44
Done 0189    1   rind7           11307   rind10.12225     07/19,09:59:58 00:02:41
Done 0190    1   rind37          26204   rind10.12225     07/19,10:00:14 00:03:44
Done 0191    1   rind36          27209   rind10.12225     07/19,10:00:15 00:03:41
Done 0192    1   rind35          25221   rind10.12225     07/19,10:00:20 00:04:06
Done 0193    1   rind38          20144   rind10.12225     07/19,10:00:33 00:03:57
Done 0194    1   burnl11         27546   rind10.12225     07/19,10:00:41 00:02:12
Done 0195    1   sunrender       6389    rind10.12225     07/19,10:00:55 00:01:48
Done 0196    1   bumblebee       14187   rind10.12225     07/19,10:01:06 00:01:07
Done 0197    1   sunrender       6521    rind10.12225     07/19,10:01:09 00:01:53
Done 0198    1   rind39          25791   rind10.12225     07/19,10:01:17 00:01:26
Done 0199    1   rind39          25793   rind10.12225     07/19,10:01:17 00:01:14
Done 0200    1   rind39          25823   rind10.12225     07/19,10:01:21 00:01:15
Done 0201    1   rind39          25838   rind10.12225     07/19,10:01:22 00:01:20
Done 0202    1   sunrender       6647    rind10.12225     07/19,10:02:46 00:01:47
Done 0203    1   sunrender       6773    rind10.12225     07/19,10:03:03 00:01:49
Done 0204    1   sunrender       6900    rind10.12225     07/19,10:04:34 00:01:47
Done 0205    1   sunrender       7024    rind10.12225     07/19,10:04:55 00:01:49
Done 0206    1   sunrender       7150    rind10.12225     07/19,10:06:22 00:01:45
Done 0207    1   sunrender       7276    rind10.12225     07/19,10:06:45 00:01:45
Done 0208    1   sunrender       7405    rind10.12225     07/19,10:08:08 00:01:44
Done 0209    1   sunrender       7529    rind10.12225     07/19,10:08:31 00:01:44
Done 0210    1   sunrender       7655    rind10.12225     07/19,10:09:53 00:01:45
Done 0211    1   sunrender       7779    rind10.12225     07/19,10:10:17 00:01:49
Done 0212    1   sunrender       7907    rind10.12225     07/19,10:11:40 00:01:50
Done 0213    1   sunrender       8031    rind10.12225     07/19,10:12:07 00:01:39
Done 0214    1   sunrender       8157    rind10.12225     07/19,10:13:32 00:01:49
Done 0215    1   sunrender       8281    rind10.12225     07/19,10:13:47 00:01:40
Done 0216    1   sunrender       8410    rind10.12225     07/19,10:15:23 00:01:41
Done 0217    1   sunrender       8482    rind10.12225     07/19,10:15:28 00:01:48
Done 0218    1   sunrender       8660    rind10.12225     07/19,10:17:05 00:01:48
Done 0219    1   sunrender       8784    rind10.12225     07/19,10:17:18 00:01:40
Done 0220    1   sunrender       8912    rind10.12225     07/19,10:18:53 00:01:46
Done 0221    1   sunrender       8992    rind10.12225     07/19,10:19:00 00:01:47
Done 0222    1   sunrender       9164    rind10.12225     07/19,10:20:42 00:01:40
Done 0223    1   sunrender       9260    rind10.12225     07/19,10:20:48 00:01:46
Done 0224    1   sunrender       9407    rind10.12225     07/19,10:22:23 00:01:47
Done 0225    1   sunrender       9531    rind10.12225     07/19,10:22:36 00:01:45
Done 0226    1   sunrender       9659    rind10.12225     07/19,10:24:11 00:01:51
Done 0227    1   sunrender       9783    rind10.12225     07/19,10:24:22 00:01:41
Done 0228    1   sunrender       9911    rind10.12225     07/19,10:26:04 00:01:37
Done 0229    1   sunrender       9917    rind10.12225     07/19,10:26:05 00:01:46
Done 0230    1   sunrender       10106   rind10.12225     07/19,10:27:42 00:01:44
Done 0231    1   sunrender       10233   rind10.12225     07/19,10:27:54 00:01:49
Done 0232    1   sunrender       10360   rind10.12225     07/19,10:29:27 00:01:51
Done 0233    1   sunrender       10484   rind10.12225     07/19,10:29:44 00:01:48
Done 0234    1   sunrender       10610   rind10.12225     07/19,10:31:20 00:01:46
Done 0235    1   sunrender       10730   rind10.12225     07/19,10:31:33 00:01:43
Done 0236    1   sunrender       10862   rind10.12225     07/19,10:33:07 00:01:49
Done 0237    1   sunrender       10986   rind10.12225     07/19,10:33:18 00:01:42
Done 0238    1   sunrender       11113   rind10.12225     07/19,10:34:58 00:01:47
Done 0239    1   sunrender       11157   rind10.12225     07/19,10:35:02 00:01:47
Done 0240    1   sunrender       11365   rind10.12225     07/19,10:36:46 00:01:38
Done 0241    1   sunrender       11409   rind10.12225     07/19,10:36:50 00:01:46
Done 0242    1   sunrender       11617   rind10.12225     07/19,10:38:26 00:01:42
Done 0243    1   sunrender       11741   rind10.12225     07/19,10:38:38 00:01:40
Done 0244    1   bumblebee       14300   rind10.12225     07/19,10:38:49 00:01:06
Done 0245    1   burnl11         31341   rind10.12225     07/19,10:38:51 00:01:44
Done 0246    1   rind39          27317   rind10.12225     07/19,10:38:51 00:01:21
Done 0247    1   rind39          27319   rind10.12225     07/19,10:38:51 00:01:16
Done 0248    1   burnl13         32630   rind10.12225     07/19,10:38:51 00:01:04
Done 0249    1   burnl16         24359   rind10.12225     07/19,10:38:51 00:01:48
Done 0250    1   rind39          27347   rind10.12225     07/19,10:38:53 00:01:17
Done 0251    1   rind9           23519   rind10.12225     07/19,10:38:54 00:02:18
Done 0252    1   rind8           29131   rind10.12225     07/19,10:39:00 00:02:13
Done 0253    1   rind38          23542   rind10.12225     07/19,10:39:05 00:03:34
Done 0254    1   rind37          30614   rind10.12225     07/19,10:39:05 00:02:41
Done 0255    1   rind35          29497   rind10.12225     07/19,10:39:16 00:03:18
Done 0256    1   rind36          31594   rind10.12225     07/19,10:39:16 00:02:50
Done 0257    1   burnl12         24009   rind10.12225     07/19,10:39:46 00:01:52
Done 0258    1   burnl17         24113   rind10.12225     07/19,10:39:48 00:01:46
Done 0259    1   burnl15         25422   rind10.12225     07/19,10:39:49 00:01:45
Done 0260    1   bumblebee       14311   rind10.12225     07/19,10:39:58 00:01:05
Done 0261    1   burnl13         32764   rind10.12225     07/19,10:39:59 00:01:20
Done 0262    1   sunrender       11867   rind10.12225     07/19,10:40:09 00:01:44
Done 0263    1   rind39          27379   rind10.12225     07/19,10:40:12 00:01:06
Done 0264    1   rind39          27394   rind10.12225     07/19,10:40:13 00:01:20
Done 0265    1   rind39          27396   rind10.12225     07/19,10:40:13 00:01:10
Done 0266    1   sunrender       11987   rind10.12225     07/19,10:40:19 00:01:48
Done 0267    1   burnl11         31541   rind10.12225     07/19,10:40:40 00:01:42
Done 0268    1   burnl16         24562   rind10.12225     07/19,10:40:44 00:01:43
Done 0269    1   bumblebee       14322   rind10.12225     07/19,10:41:03 00:01:02
Done 0270    1   rind7           15750   rind10.12225     07/19,10:41:05 00:02:33
Done 0271    1   rind39          27439   rind10.12225     07/19,10:41:07 00:01:14
Done 0272    1   rind9           23712   rind10.12225     07/19,10:41:16 00:02:17
Done 0273    1   rind8           29307   rind10.12225     07/19,10:41:17 00:02:21
Done 0274    1   burnl13         458     rind10.12225     07/19,10:41:20 00:01:00

STAT FRAME   TRY HOSTNAME        PID     JOBID            START          ELAPSED  NOTES
Done 0000    1   rind39          27459   rind10.12226     07/19,10:41:21 00:01:03
Done 0001    1   rind39          27485   rind10.12226     07/19,10:41:27 00:00:57
Done 0002    1   rind39          27511   rind10.12226     07/19,10:41:36 00:00:47
Done 0003    1   burnl15         25606   rind10.12226     07/19,10:41:36 00:00:20
Done 0004    1   burnl17         24298   rind10.12226     07/19,10:41:39 00:00:19
Done 0005    1   burnl12         24209   rind10.12226     07/19,10:41:42 00:00:15
Done 0006    1   rind37          30899   rind10.12226     07/19,10:41:51 00:00:25
Done 0007    1   sunrender       12119   rind10.12226     07/19,10:41:56 00:01:51
Done 0008    1   burnl12         24267   rind10.12226     07/19,10:41:58 00:00:15
Done 0009    1   burnl15         25668   rind10.12226     07/19,10:41:58 00:00:15
Done 0010    1   burnl17         24354   rind10.12226     07/19,10:42:00 00:00:14
Done 0011    1   sunrender       12222   rind10.12226     07/19,10:42:10 00:01:17
Done 0012    1   rind36          32001   rind10.12226     07/19,10:42:10 00:00:20
Done 0013    1   bumblebee       14332   rind10.12226     07/19,10:42:10 00:00:10
Done 0014    1   burnl12         24309   rind10.12226     07/19,10:42:14 00:00:14
Done 0015    1   burnl15         25731   rind10.12226     07/19,10:42:15 00:00:15
Done 0016    1   burnl17         24413   rind10.12226     07/19,10:42:16 00:00:13
Done 0017    1   rind37          30977   rind10.12226     07/19,10:42:17 00:00:18
Done 0018    1   bumblebee       14344   rind10.12226     07/19,10:42:22 00:00:06
Done 0019    1   burnl13         576     rind10.12226     07/19,10:42:24 00:00:12
Done 0020    1   rind39          27542   rind10.12226     07/19,10:42:25 00:01:51
Done 0021    1   rind39          27544   rind10.12226     07/19,10:42:25 00:01:51
Done 0022    1   rind39          27546   rind10.12226     07/19,10:42:25 00:00:57
Done 0023    1   rind39          27548   rind10.12226     07/19,10:42:25 00:00:57
Done 0024    1   burnl11         31727   rind10.12226     07/19,10:42:27 00:00:14
Done 0025    1   burnl12         24366   rind10.12226     07/19,10:42:29 00:00:14
Done 0026    1   bumblebee       14351   rind10.12226     07/19,10:42:29 00:00:05
Done 0027    1   rind36          32194   rind10.12226     07/19,10:42:31 00:00:15
Done 0028    1   burnl15         25774   rind10.12226     07/19,10:42:31 00:00:13
Done 0029    1   burnl16         24748   rind10.12226     07/19,10:42:31 00:00:35
Done 0030    1   burnl17         24453   rind10.12226     07/19,10:42:31 00:00:14
Done 0031    1   rind37          31119   rind10.12226     07/19,10:42:37 00:00:16
Done 0032    1   bumblebee       14358   rind10.12226     07/19,10:42:37 00:00:05
Done 0033    1   burnl13         618     rind10.12226     07/19,10:42:37 00:00:12
Done 0034    1   rind35          29752   rind10.12226     07/19,10:42:38 00:00:21
Done 0035    1   rind38          23947   rind10.12226     07/19,10:42:43 00:00:24
Done 0036    1   burnl11         31790   rind10.12226     07/19,10:42:43 00:00:14
Done 0037    1   bumblebee       14365   rind10.12226     07/19,10:42:43 00:00:06
Done 0038    1   burnl12         24408   rind10.12226     07/19,10:42:45 00:00:15
Done 0039    1   burnl15         25829   rind10.12226     07/19,10:42:46 00:00:14
Done 0040    1   burnl17         24508   rind10.12226     07/19,10:42:47 00:00:15
Done 0041    1   rind36          32231   rind10.12226     07/19,10:42:49 00:00:18
Done 0042    1   burnl13         677     rind10.12226     07/19,10:42:51 00:00:12
Done 0043    1   bumblebee       14374   rind10.12226     07/19,10:42:51 00:00:06
Done 0044    1   rind37          31158   rind10.12226     07/19,10:42:54 00:00:18
Done 0045    1   bumblebee       14381   rind10.12226     07/19,10:42:58 00:00:06
Done 0046    1   burnl11         31849   rind10.12226     07/19,10:42:59 00:00:14
Done 0047    1   burnl12         24466   rind10.12226     07/19,10:43:01 00:00:14
Done 0048    1   rind35          29827   rind10.12226     07/19,10:43:01 00:00:17
Done 0049    1   burnl15         25869   rind10.12226     07/19,10:43:01 00:00:14
Done 0050    1   burnl17         24547   rind10.12226     07/19,10:43:03 00:00:14
Done 0051    1   burnl13         716     rind10.12226     07/19,10:43:03 00:00:09
Done 0052    1   bumblebee       14388   rind10.12226     07/19,10:43:05 00:00:06
Done 0053    1   rind36          32280   rind10.12226     07/19,10:43:09 00:00:17
Done 0054    1   rind38          24117   rind10.12226     07/19,10:43:09 00:00:17
Done 0055    1   burnl16         24836   rind10.12226     07/19,10:43:09 00:00:15
Done 0056    1   bumblebee       14395   rind10.12226     07/19,10:43:12 00:00:05
Done 0057    1   rind37          31206   rind10.12226     07/19,10:43:12 00:00:16
Done 0058    1   burnl13         760     rind10.12226     07/19,10:43:15 00:00:08
Done 0059    1   burnl11         31889   rind10.12226     07/19,10:43:15 00:00:14
Done 0060    1   burnl12         24505   rind10.12226     07/19,10:43:16 00:00:11
Done 0061    1   burnl15         25924   rind10.12226     07/19,10:43:16 00:00:14
Done 0062    1   burnl17         24602   rind10.12226     07/19,10:43:18 00:00:15
Done 0063    1   rind35          29948   rind10.12226     07/19,10:43:19 00:00:17
Done 0064    1   bumblebee       14402   rind10.12226     07/19,10:43:19 00:00:06
Done 0065    1   rind39          27638   rind10.12226     07/19,10:43:22 00:01:49
Done 0066    1   rind39          27640   rind10.12226     07/19,10:43:22 00:01:49
Done 0067    1   burnl13         799     rind10.12226     07/19,10:43:24 00:00:09
Done 0068    1   burnl16         24876   rind10.12226     07/19,10:43:25 00:00:15
Done 0069    1   rind38          24188   rind10.12226     07/19,10:43:27 00:00:17
Done 0070    1   bumblebee       14411   rind10.12226     07/19,10:43:27 00:00:06
Done 0071    1   rind36          32325   rind10.12226     07/19,10:43:27 00:00:18
Done 0072    1   sunrender       12330   rind10.12226     07/19,10:43:28 00:00:47
Done 0073    1   burnl12         24560   rind10.12226     07/19,10:43:28 00:00:15
Done 0074    1   burnl11         31947   rind10.12226     07/19,10:43:29 00:00:14
Done 0075    1   rind37          31255   rind10.12226     07/19,10:43:30 00:00:18
Done 0076    1   burnl15         25963   rind10.12226     07/19,10:43:31 00:00:14
Done 0077    1   burnl13         839     rind10.12226     07/19,10:43:34 00:00:09
Done 0078    1   burnl17         24667   rind10.12226     07/19,10:43:34 00:00:14
Done 0079    1   bumblebee       14418   rind10.12226     07/19,10:43:34 00:00:06
Done 0080    1   rind9           23913   rind10.12226     07/19,10:43:36 00:00:24
Done 0081    1   rind35          30070   rind10.12226     07/19,10:43:37 00:00:18
Done 0082    1   rind8           29710   rind10.12226     07/19,10:43:40 00:00:20
Done 0083    1   rind7           15975   rind10.12226     07/19,10:43:42 00:00:21
Done 0084    1   burnl16         24932   rind10.12226     07/19,10:43:42 00:00:14
Done 0085    1   bumblebee       14425   rind10.12226     07/19,10:43:42 00:00:06
Done 0086    1   burnl12         24600   rind10.12226     07/19,10:43:43 00:00:14
Done 0087    1   burnl13         879     rind10.12226     07/19,10:43:45 00:00:09
Done 0088    1   burnl11         31986   rind10.12226     07/19,10:43:45 00:00:14
Done 0089    1   rind38          24313   rind10.12226     07/19,10:43:46 00:00:18
Done 0090    1   burnl15         26019   rind10.12226     07/19,10:43:46 00:00:15

STAT FRAME   TRY HOSTNAME        PID     JOBID            START          ELAPSED  NOTES
Done 0000    1   rind36          32371   rind10.12227     07/19,10:43:47 00:00:13
Done 0001    1   sunrender       12433   rind10.12227     07/19,10:43:48 00:00:23
Done 0002    1   rind37          31303   rind10.12227     07/19,10:43:50 00:00:12
Done 0003    1   bumblebee       14432   rind10.12227     07/19,10:43:51 00:00:07
Done 0004    1   burnl17         24706   rind10.12227     07/19,10:43:52 00:00:12
Done 0005    1   burnl13         921     rind10.12227     07/19,10:43:56 00:00:07
Done 0006    1   rind35          30118   rind10.12227     07/19,10:43:56 00:00:12
Done 0007    1   burnl16         24988   rind10.12227     07/19,10:43:58 00:00:11
Done 0008    1   bumblebee       14441   rind10.12227     07/19,10:43:59 00:00:04
Done 0009    1   burnl12         24667   rind10.12227     07/19,10:44:00 00:00:11
Done 0010    1   rind36          32455   rind10.12227     07/19,10:44:02 00:00:12
Done 0011    1   burnl11         32041   rind10.12227     07/19,10:44:02 00:00:11
Done 0012    1   rind8           29893   rind10.12227     07/19,10:44:03 00:00:11
Done 0013    1   rind9           24146   rind10.12227     07/19,10:44:03 00:00:12
Done 0014    1   rind37          31349   rind10.12227     07/19,10:44:04 00:00:11
Done 0015    1   burnl13         961     rind10.12227     07/19,10:44:04 00:00:07
Done 0016    1   burnl15         26074   rind10.12227     07/19,10:44:04 00:00:10
Done 0017    1   bumblebee       14448   rind10.12227     07/19,10:44:04 00:00:04
Done 0018    1   rind7           16188   rind10.12227     07/19,10:44:05 00:00:10
Done 0019    1   burnl17         24761   rind10.12227     07/19,10:44:05 00:00:10
Done 0020    1   rind38          24359   rind10.12227     07/19,10:44:05 00:00:13
Done 0021    1   rind35          30152   rind10.12227     07/19,10:44:11 00:00:26
Done 0022    1   burnl16         25027   rind10.12227     07/19,10:44:11 00:00:22
Done 0023    1   bumblebee       14455   rind10.12227     07/19,10:44:11 00:00:12
Done 0024    1   burnl12         24706   rind10.12227     07/19,10:44:11 00:00:23
Done 0025    1   sunrender       12554   rind10.12227     07/19,10:44:13 00:00:53
Done 0026    1   burnl13         1000    rind10.12227     07/19,10:44:13 00:00:15
Done 0027    1   burnl11         32081   rind10.12227     07/19,10:44:13 00:00:26
Done 0028    1   rind36          32516   rind10.12227     07/19,10:44:14 00:00:31
Done 0029    1   rind8           29929   rind10.12227     07/19,10:44:16 00:00:26
Done 0030    1   burnl15         26114   rind10.12227     07/19,10:44:16 00:00:27
Done 0031    1   burnl17         24800   rind10.12227     07/19,10:44:16 00:00:26
Done 0032    1   rind9           24180   rind10.12227     07/19,10:44:16 00:00:26
Done 0033    1   sunrender       12673   rind10.12227     07/19,10:44:17 00:00:50
Done 0034    1   rind7           16271   rind10.12227     07/19,10:44:17 00:00:29
Done 0035    1   rind37          31404   rind10.12227     07/19,10:44:17 00:00:33
Done 0036    1   rind39          27686   rind10.12227     07/19,10:44:18 00:01:50
Done 0037    1   rind39          27688   rind10.12227     07/19,10:44:18 00:02:40
Done 0038    1   rind38          24394   rind10.12227     07/19,10:44:20 00:00:35
Done 0039    1   bumblebee       14462   rind10.12227     07/19,10:44:24 00:00:13
Done 0040    1   burnl13         1058    rind10.12227     07/19,10:44:31 00:00:15
Done 0041    1   burnl16         25082   rind10.12227     07/19,10:44:34 00:00:28
Done 0042    1   burnl12         24761   rind10.12227     07/19,10:44:36 00:00:29
Done 0043    1   rind35          30210   rind10.12227     07/19,10:44:38 00:00:34
Done 0044    1   bumblebee       14471   rind10.12227     07/19,10:44:38 00:00:11
Done 0045    1   burnl11         32153   rind10.12227     07/19,10:44:41 00:00:29
Done 0046    1   rind8           29989   rind10.12227     07/19,10:44:43 00:00:28
Done 0047    1   rind9           24236   rind10.12227     07/19,10:44:44 00:00:28
Done 0048    1   burnl15         26185   rind10.12227     07/19,10:44:44 00:00:29
Done 0049    1   burnl17         24872   rind10.12227     07/19,10:44:45 00:00:28
Done 0050    1   rind7           16451   rind10.12227     07/19,10:44:47 00:00:29
Done 0051    1   rind36          32678   rind10.12227     07/19,10:44:47 00:00:35
Done 0052    1   burnl13         1103    rind10.12227     07/19,10:44:47 00:00:17
Done 0053    1   bumblebee       14478   rind10.12227     07/19,10:44:50 00:00:13
Done 0054    1   rind37          31549   rind10.12227     07/19,10:44:51 00:00:36
Done 0055    1   rind38          24464   rind10.12227     07/19,10:44:56 00:00:37
Done 0056    1   burnl16         25158   rind10.12227     07/19,10:45:03 00:00:30
Done 0057    1   bumblebee       14485   rind10.12227     07/19,10:45:04 00:00:13
Done 0058    1   burnl13         1165    rind10.12227     07/19,10:45:05 00:00:17
Done 0059    1   burnl12         24850   rind10.12227     07/19,10:45:07 00:00:30
Done 0060    1   sunrender       12800   rind10.12227     07/19,10:45:08 00:01:15
Done 0061    1   sunrender       12802   rind10.12227     07/19,10:45:08 00:01:15
Done 0062    1   burnl11         32224   rind10.12227     07/19,10:45:10 00:00:30
Done 0063    1   rind8           30048   rind10.12227     07/19,10:45:13 00:00:29
Done 0064    1   rind9           24320   rind10.12227     07/19,10:45:13 00:00:29
Done 0065    1   rind35          30269   rind10.12227     07/19,10:45:13 00:00:36
Done 0066    1   rind39          27737   rind10.12227     07/19,10:45:14 00:02:33
Done 0067    1   rind39          27739   rind10.12227     07/19,10:45:14 00:02:35
Done 0068    1   burnl15         26258   rind10.12227     07/19,10:45:14 00:00:31
Done 0069    1   burnl17         24944   rind10.12227     07/19,10:45:15 00:00:29
Done 0070    1   rind7           16511   rind10.12227     07/19,10:45:17 00:00:29
Done 0071    1   rind36          301     rind10.12227     07/19,10:45:23 00:00:37
Done 0072    1   burnl13         1223    rind10.12227     07/19,10:45:23 00:00:16
Done 0073    1   rind37          31622   rind10.12227     07/19,10:45:27 00:00:36
Done 0074    1   rind38          24557   rind10.12227     07/19,10:45:33 00:00:35
Done 0075    1   burnl16         25234   rind10.12227     07/19,10:45:33 00:00:28
Done 0076    1   burnl12         24921   rind10.12227     07/19,10:45:40 00:00:30
Done 0077    1   burnl13         1324    rind10.12227     07/19,10:45:40 00:00:19
Done 0078    1   burnl11         32297   rind10.12227     07/19,10:45:41 00:00:29
Done 0079    1   rind9           24546   rind10.12227     07/19,10:45:44 00:00:29
Done 0080    1   rind8           30106   rind10.12227     07/19,10:45:44 00:00:29
Done 0081    1   burnl17         25015   rind10.12227     07/19,10:45:45 00:00:30
Done 0082    1   burnl15         26330   rind10.12227     07/19,10:45:45 00:00:29
Done 0083    1   rind7           16567   rind10.12227     07/19,10:45:47 00:00:30
Done 0084    1   rind35          30338   rind10.12227     07/19,10:45:50 00:00:36
Done 0085    1   bumblebee       14502   rind10.12227     07/19,10:45:59 00:00:11
Done 0086    1   burnl13         1423    rind10.12227     07/19,10:46:00 00:00:16
Done 0087    1   rind36          500     rind10.12227     07/19,10:46:00 00:00:35
Done 0088    1   burnl16         25306   rind10.12227     07/19,10:46:03 00:00:30
Done 0089    1   rind37          31690   rind10.12227     07/19,10:46:05 00:00:36
Done 0090    1   rind38          24732   rind10.12227     07/19,10:46:11 00:00:34
Done 0091    1   rind39          27787   rind10.12227     07/19,10:46:12 00:02:21
Done 0092    1   burnl12         24992   rind10.12227     07/19,10:46:12 00:00:30
Done 0093    1   burnl11         32368   rind10.12227     07/19,10:46:12 00:00:28
Done 0094    1   bumblebee       14512   rind10.12227     07/19,10:46:12 00:00:14
Done 0095    1   rind9           24739   rind10.12227     07/19,10:46:14 00:00:29
Done 0096    1   rind8           30162   rind10.12227     07/19,10:46:15 00:00:29
Done 0097    1   burnl15         26402   rind10.12227     07/19,10:46:16 00:00:29
Done 0098    1   burnl17         25088   rind10.12227     07/19,10:46:16 00:00:29
Done 0099    1   burnl13         1481    rind10.12227     07/19,10:46:17 00:00:16
Done 0100    1   rind7           16625   rind10.12227     07/19,10:46:18 00:00:30
Done 0101    1   sunrender       12950   rind10.12227     07/19,10:46:24 00:01:14
Done 0102    1   sunrender       12952   rind10.12227     07/19,10:46:24 00:01:14
Done 0103    1   bumblebee       14519   rind10.12227     07/19,10:46:27 00:00:12
Done 0104    1   rind35          30405   rind10.12227     07/19,10:46:28 00:00:36
Done 0105    1   burnl13         1558    rind10.12227     07/19,10:46:34 00:00:17
Done 0106    1   burnl16         25378   rind10.12227     07/19,10:46:35 00:00:29
Done 0107    1   rind36          569     rind10.12227     07/19,10:46:38 00:00:36
Done 0108    1   bumblebee       14526   rind10.12227     07/19,10:46:39 00:00:13
Done 0109    1   burnl11         32439   rind10.12227     07/19,10:46:41 00:00:30
Done 0110    1   rind37          31747   rind10.12227     07/19,10:46:42 00:00:36
Done 0111    1   burnl12         25065   rind10.12227     07/19,10:46:44 00:00:30
Done 0112    1   rind8           30220   rind10.12227     07/19,10:46:45 00:00:29
Done 0113    1   rind9           24801   rind10.12227     07/19,10:46:45 00:00:29
Done 0114    1   rind38          24885   rind10.12227     07/19,10:46:46 00:00:36
Done 0115    1   burnl15         26474   rind10.12227     07/19,10:46:48 00:00:30
Done 0116    1   burnl17         25159   rind10.12227     07/19,10:46:48 00:00:29
Done 0117    1   rind7           16715   rind10.12227     07/19,10:46:49 00:00:30
Done 0118    1   burnl13         1618    rind10.12227     07/19,10:46:53 00:00:15
Done 0119    1   bumblebee       14535   rind10.12227     07/19,10:46:54 00:00:13
Done 0120    1   rind39          27811   rind10.12227     07/19,10:47:00 00:02:19
Done 0121    1   rind35          30514   rind10.12227     07/19,10:47:05 00:00:33
Done 0122    1   burnl16         25449   rind10.12227     07/19,10:47:06 00:00:30

   From: Greg Ercolano <erco@(email surpressed)>
Subject: Re: several strange Rush behaviours
   Date: Fri, 20 Jul 2012 13:14:35 -0400
Msg# 2261
On 07/20/12 03:31, Abraham Schneider wrote:
>>        Wow, you have large jobids! You must have the jobidmax value
>>       cranked up. Be careful with that (see below).
>
> Don't worry, it's just the number that I cranked up. Normally there
> aren't more than 200-300 jobs in rush. We dump them regularly.

	Sounds good.

> I think
> I'm using these high numbers because when there are jobs on the farm and
> the maximum number is reached, new jobs will start with 0 again and
> these jobs will be rendered next, despite older jobs waiting in the
> queue. FIFO is based on jobID I think!?

	FIFO is based primarily on the submit time of the job.
	The jobid is only used to break a tie, where two jobs
	submit during the same second.
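
	For illustration only, the ordering rule above can be sketched in
	Python. This is not Rush's actual code: the Job class, its field
	names, and the choice that the lower jobid number wins a tie are
	all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    jobid: str        # e.g. "rind10.12225"
    submit_time: int  # submit time in whole seconds (hypothetical field)


def jobid_number(jobid: str) -> int:
    """Return the numeric part after the last dot of a jobid."""
    return int(jobid.rsplit(".", 1)[1])


def fifo_order(jobs):
    """Sort by submit time first; the jobid only breaks same-second ties.

    Assumption: on a tie, the lower jobid number goes first.
    """
    return sorted(jobs, key=lambda j: (j.submit_time, jobid_number(j.jobid)))


if __name__ == "__main__":
    jobs = [
        Job("rind10.3", 200),      # jobid wrapped around, but submitted later
        Job("rind10.12226", 100),
        Job("rind10.12225", 100),  # same second as 12226
    ]
    for job in fifo_order(jobs):
        print(job.jobid, job.submit_time)
```

	Under this rule a job whose id has wrapped back to a low number
	still sorts after older jobs, because its submit time is later.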

>>       Can I see these reports for the rind10.12225/6/7 jobs? eg:
>>
>>               rush -lf rind10.12225 rind10.12226 rind10.12227
>>               rush -lc rind10.12225 rind10.12226 rind10.12227
>
> see separate mail to greg@(email surpressed)

	Great -- see the previous email for an analysis of those.
	Turns out those jobs all seemed OK relative to one another,
	so I requested the later jobs as well.

> Of course I know that. As you can see in the report, 81% done was not at
> the end of the shot, as the shot is quite long. You see the problem part
> where only the machine 'sunrender' is rendering frames.

	Hmm, I saw the sunrenders in the reports, but didn't see
	what was wrong there..?
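
	One way to make such a stretch easier to spot is to scan a frame
	report for long runs of consecutive frames that were all started
	by the same host. The Python below is a rough sketch, not a Rush
	tool: the function names are made up, and the column layout
	(STAT FRAME TRY HOSTNAME PID JOBID START ELAPSED) is assumed from
	the reports pasted in this thread.

```python
import re
from itertools import groupby

# One report row: STAT FRAME TRY HOSTNAME PID JOBID START ELAPSED [NOTES]
ROW = re.compile(
    r"^(\w+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)"
)


def parse_report(text):
    """Parse report rows into dicts; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            stat, frame, _tries, host, _pid, jobid, start, elapsed = m.groups()
            rows.append({
                "stat": stat, "frame": int(frame), "host": host,
                "jobid": jobid, "start": start, "elapsed": elapsed,
            })
    return rows


def single_host_runs(rows, min_len=10):
    """Return (host, first_frame, last_frame, first_start, last_start)
    for every run of at least min_len consecutive rows on one host."""
    runs = []
    for host, group in groupby(rows, key=lambda r: r["host"]):
        group = list(group)
        if len(group) >= min_len:
            first, last = group[0], group[-1]
            runs.append((host, first["frame"], last["frame"],
                         first["start"], last["start"]))
    return runs
```

	Feeding it the saved report text for a job, for example
	single_host_runs(parse_report(text), min_len=10), lists each
	stretch where one machine took every frame in a row.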

>>> 2. problem:
>>> Most of the time, switching a machine/workstation from offline to
>>> online, it takes from many seconds to several minutes for this machine
>>> to pick up a frame and start rendering. The machine is shown as
>>> 'online' instantly, but it just won't start rendering a frame.
>>> It's listed as 'online' and 'idle' for several minutes. This happens
>>> for all of our machines, doesn't matter if they are Macs or Linux.
>>
>>       Can you send me the tasklist for the machine in question, ie:
>>
>>               rush -tasklist SLOWHOST
>
> see separate mail

D'oh:
	I think I forgot to go into details on this; I probably should
	have said 'when you online a machine and it's not picking up,
	send the tasklist at that moment when it's not picking up.'

	And in such a case, it would also be useful to see what the state
	of all the jobs on the farm is, eg: 'rush -laj; rush -lac'.

	So perhaps you can resend those three (tasklist/laj/lac) right
	when the machine is onlined and not rendering.

So if it's not large leftover jobs, then it must be something
in the code.

Perhaps what is happening there is if a machine has been offline
for a while, and many jobs have come and gone but not dumped,
perhaps the scheduler has old jobs left behind in it, such that
when the machine comes online, it walks through all those old
jobs to find the next one to render, and it takes a while to
walk through them all to 'catch up'.

I better check this.. esp. in the context of a FIFO scheduler.

> seems not to be a problem there, the rushd.log of the problem machine
> 'apu' is quite small:
>
> today:
> 07/20,03:00:27 ROTATE     rushd.log rotated. pid=146, 0/3 busy, OFFLINE
> 07/20,03:00:27 ROTATE     apu RUSHD 102.42a9d PID=146  Boot=07/17/12,19:07:25
>
> yesterday:
> 07/19,03:00:16 ROTATE     rushd.log rotated. pid=146, 0/3 busy, OFFLINE
> 07/19,03:00:16 ROTATE     apu RUSHD 102.42a9d PID=146     Boot=07/17/12,19:07:25
> 07/19,18:23:56 SECURITY   Daemon changed to Online by aschneid@itchy[192.168.10.21], Remark:online by aschneid 07/19/12,18:23 via irush state (online) by aschneid@itchy[192.168.10.21]
> 07/19,19:02:03 SECURITY   Daemon changed to Getoff by aschneid@itchy[192.168.10.21], Remark:getoff by aschneid 07/19/12,19:02 via irush state (getoff) by aschneid@itchy[192.168.10.21]
> 07/20,03:00:27 ROTATE     rushd.log rotated by rush.conf: logrotatehour=3

	Yes, those are /really/ healthy logs.
	Scary in fact.

	But it looks like it was only online for 40 mins (between 6:23p ~ 7p),
	so it probably didn't do much rendering, if any.

-- 
Greg Ercolano, erco@(email surpressed)
Seriss Corporation
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)ext.23
Fax: (Tel# suppressed)
Cel: (Tel# suppressed)