From: Dylan Penhale <dylan@(email suppressed).au>
Subject: Shake INIT_Processeses problem
Date: Mon, 24 Jul 2006 22:58:23 -0400
Msg# 1351
Has anyone seen the following error when trying to render shake jobs
through rush?

    Executing: shake -exec /var/tmp/.RUSH_TMP.42/re_245_330_x005sc_F003.shk -t 26-26 -proxyscale Base -vv -cpus 2
    INIT_Processeses(), could not establish the default connection to the WindowServer.
    --- shake: terminated by signal 6

This is only happening on 3 machines, the others are fine.
The 3 machines are able to resolve DNS, and get the UID/GID of the submitting user.
Shake runs fine on these boxes.

I notice that this may be similar to the AE issue listed here:
http://seriss.com/rush-current/issues-afterfx-6.5/index.html

Should I change the shake owner to 0:0 on the problem hosts? I can't figure why only some boxes have the problem.

Regards

Dylan Penhale
Systems Administrator
Fuel International
From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: Shake INIT_Processeses problem
Date: Mon, 24 Jul 2006 23:16:20 -0400
Msg# 1352
> INIT_Processeses(), could not establish the default
> connection to the WindowServer.--- shake: terminated by signal 6

Sounds like shake is trying to access the window manager when it shouldn't be.

The two most common causes of this:

    1) User error -- the shake file is trying to render
       to the screen, instead of rendering to a file.

    2) Bad OS library (e.g. QuickTime) loaded by shake
       that is trying to manipulate the window manager.

Regarding #1, try running the same shake command from a terminal to see if it opens a GUI. If it does, that's the problem.

If it doesn't, then it's probably #2, which means some OSX library (that shake is loading) is trying to access the window manager when the library is loaded and initialized.

In the past I've seen QuickTime libraries cause this, where someone either updated the QuickTime libs from Apple with buggy libs causing the problem, or did a recent OS re-install from CDs that DIDN'T take the latest updates from Apple.

> This is only happening on 3 machines, the others are fine.

Check the patch level of the machines (i.e. run 'sw_vers' on each box).

You can probably replicate this problem by ssh'ing into the same machine that rendered the frame and failed, and logging in as the same user the rush render was running shake as. This user likely doesn't match the user logged into the window manager, hence the error about being unable to connect to the window manager.

Shake renders should not be trying to access the window manager unless something is wrong, i.e. #1 or #2 above.
--
Greg Ercolano, erco@(email suppressed)
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed) Fax: (Tel# suppressed) Cel: (Tel# suppressed)
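The advice to compare patch levels can be scripted once the 'sw_vers' output from each box has been collected. The following is only a rough sketch under stated assumptions: the output is assumed to have been gathered already (for example over ssh), and the host names and version strings in SAMPLE are made up for illustration. It groups hosts by OS version and build so that an odd machine stands out.

```python
from collections import defaultdict


def parse_sw_vers(text):
    """Parse 'Key: value' lines, as printed by sw_vers, into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def group_hosts_by_build(outputs):
    """Map (ProductVersion, BuildVersion) -> sorted list of host names."""
    groups = defaultdict(list)
    for host, text in outputs.items():
        info = parse_sw_vers(text)
        key = (info.get("ProductVersion"), info.get("BuildVersion"))
        groups[key].append(host)
    return {key: sorted(hosts) for key, hosts in groups.items()}


# Hypothetical captured output: host names and versions are illustrative only.
SAMPLE = {
    "render01": "ProductName:\tMac OS X\nProductVersion:\t10.4.7\nBuildVersion:\t8J135\n",
    "render02": "ProductName:\tMac OS X\nProductVersion:\t10.4.7\nBuildVersion:\t8J135\n",
    "render03": "ProductName:\tMac OS X\nProductVersion:\t10.4.6\nBuildVersion:\t8I127\n",
}

if __name__ == "__main__":
    for build, hosts in group_hosts_by_build(SAMPLE).items():
        print(build, hosts)
```

A host that lands in a group of its own is a reasonable first candidate for a missing or mismatched system update.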
From: Dylan Penhale <dylanpenhale@(email suppressed)>
Subject: RE: Shake INIT_Processeses problem
Date: Tue, 01 Aug 2006 22:32:07 -0400
Msg# 1359
Thanks Greg

If I ssh into the problem machine as the user that submits the job and try to launch shake I get:

    kCGErrorRangeCheck : Window Server communications from outside of session allowed for root and console user only
    INIT_Processeses(), could not establish the default connection to the WindowServer.
    Abort trap

However, I get that error on other machines that "are" able to render out the frame fine.

This problem is intermittent too. Sometimes the machine can render; occasionally we get this:

    Executing: shake -exec /var/tmp/.RUSH_TMP.42/re_245_330_x005sc_F003.shk -t 26-26 -proxyscale Base -vv -cpus 2
    INIT_Processeses(), could not establish the default connection to the WindowServer.
    --- shake: terminated by signal 6

I think the error is linked though. The user that is having the problem has a lower ID than others (169 compared to the usual 1000+), and I do remember reading something about low IDs being a problem for Mac machines.

I will change his ID and report back.
From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: Shake INIT_Processeses problem
Date: Wed, 02 Aug 2006 00:39:45 -0400
Msg# 1360
Dylan Penhale wrote:
> Should I change the shake owner to 0:0 on the problem hosts?

Doing a chmod 4755; chown 0:0 will surely bypass the problem, similar to how that 'fixes' the problem with AfterFx.

It's not a great solution, of course, as it makes the program run as root, and the files it reads/writes are accessed as root too. But in a production environment, a sysadmin has to 'do what you gotta do' to keep the production locomotive running on the track, permissions be damned.

> I can't figure why only some boxes have the problem.

I'd bet it's a library issue or plugin issue, or a combo of the two where some machines have different versions of libraries and/or plugins than others.

In ssh, you might try ktrace'ing the binary to see if you can determine /which/ library is being initialized that is causing the problem. Sometimes libraries initialize right after they load, giving a tell-tale sign as to the problem.

If you can figure out which lib it is, you might then be able to compare the file size or rev number of that lib against the working machines.
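Comparing a suspect library against the copy on a working machine can be done with a size-plus-checksum fingerprint. The sketch below is one possible way to do that, not something taken from rush or shake: the helper names are hypothetical, the checksum is an addition beyond the "file size or rev number" mentioned above, and collecting the fingerprints from each host is left to whatever remote mechanism is already in use.

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for a single file."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large libraries are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def differing_hosts(fingerprints, reference_host):
    """Return the hosts whose fingerprint differs from the reference host's."""
    reference = fingerprints[reference_host]
    return sorted(
        host for host, value in fingerprints.items() if value != reference
    )
```

Run `fingerprint()` on the same library path on every box, store the results in a dict keyed by host name, and pass that dict to `differing_hosts()` with a known-good machine as the reference.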
From: Dylan Penhale <dylanpenhale@(email suppressed)>
Subject: RE: Shake INIT_Processeses problem
Date: Fri, 04 Aug 2006 03:41:49 -0400
Msg# 1362
I have just noticed that this is happening on some boxes that are not rendering.

We have just started rolling out the 102.42a6 update to a few machines on the farm today. Do you think this is related?
From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: Shake INIT_Processeses problem
Date: Fri, 04 Aug 2006 04:12:29 -0400
Msg# 1364
Dylan Penhale wrote:
> I have just noticed that this is happening on some boxes that are not rendering.

Hmm, not sure I follow. This error shouldn't have anything to do with whether the machines are rendering anything. The error is caused by shake trying to access the window manager, and failing because it isn't being invoked by the same user logged into the window manager.

Actually I doubt it's shake's code that's responsible for the error (unless the user is rendering to the screen). It's more likely that shake is loading a dynamic library from the OS (like the QuickTime lib), and the library's initialization code is trying to manipulate or in some way access the window manager.

> We have just started rolling out a few of the 102.42a6 update to the
> farm today. Do you think this is related?

No, I can't see how the Rush install could impact shake. You can replicate this problem with ssh entirely outside of rush, so the only possible way rush could affect shake is if rush manipulated the shake directory or the OS libraries, which it doesn't.

Maybe I'm missing something about your question. By what means do you think one program could affect the other?

The only things the rush install modifies outside of the rush directory are the rush boot script and the csh and sh startup files, where it adds rush/bin to the PATH.
From: Dylan Penhale <dylanpenhale@(email suppressed)>
Subject: RE: Shake INIT_Processeses problem
Date: Tue, 22 Aug 2006 01:31:19 -0400
Msg# 1377
To follow up on this issue, we found that it WAS in fact QuickTime that was failing, but only on certain machines, and only when trying to render shake scenes containing QuickTime files. When QuickTime was unable to open an error dialog box to inform the user of the error, the error about the window manager appeared in the shake log, presumably because it gets written to stderr.

Reinstalling QuickTime on the few boxes with this problem has remedied it.

Thanks Greg
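Since the failure shows up as text in the frame log, affected frames can be found by scanning logs for the two messages quoted in this thread. This is a minimal sketch, assuming the log is available as plain text; the function name is hypothetical, and "terminated by signal 6" can also appear for unrelated aborts, so a hit is a hint rather than proof.

```python
# The two message fragments quoted earlier in this thread.
MARKERS = (
    "could not establish the default connection to the WindowServer",
    "terminated by signal 6",
)


def find_windowserver_failures(log_text):
    """Return (line_number, line) pairs for log lines containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.strip()))
    return hits


# Sample text assembled from the log excerpt posted in this thread.
SAMPLE_LOG = (
    "Executing: shake -exec /var/tmp/.RUSH_TMP.42/re_245_330_x005sc_F003.shk "
    "-t 26-26 -proxyscale Base -vv -cpus 2\n"
    "INIT_Processeses(), could not establish the default connection to the WindowServer.\n"
    "--- shake: terminated by signal 6\n"
)

if __name__ == "__main__":
    for number, line in find_windowserver_failures(SAMPLE_LOG):
        print(number, line)
```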
From: Dylan Penhale <dylanpenhale@(email suppressed)>
Subject: RE: Shake INIT_Processeses problem
Date: Fri, 04 Aug 2006 04:21:56 -0400
Msg# 1366
Another thing. When I kill the rshd, I notice mayabatch restarts several times afterwards. I have to kill it about 3 times. It looks like something is trying to relaunch it. Would the perl that rush calls do something like this?
From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: Shake INIT_Processeses problem
Date: Fri, 04 Aug 2006 05:07:10 -0400
Msg# 1367
Dylan Penhale wrote:
> Another thing. When I kill the rshd..

rshd or rushd? I'm guessing you mean rushd, as rush doesn't make use of rsh or rshd.

Not sure why you're killing rushd. You should probably just requeue the frame via irush (or via 'rush -que') so the script and its process hierarchy get killed correctly.

If you try to kill the mayabatch process, the render script will probably think the render failed due to an error, and retry up to three times before giving up on the machine. (The user probably had "Retries: 3" set when they submitted the job; this retry behavior is in the render script.)

> I notice mayabatch restart several times afterwards.

mayabatch, or shake? (This thread is about a problem with shake, so I guess I'm not sure how mayabatch snuck in.. maybe you're trying to kill other renders to see if they're affecting shake.)

> I have to kill it about 3 times. It looks like something is trying to
> relaunch it. Would the perl that rush calls do something like this?

The log for the frame you're trying to kill will probably show the retry messages from the script.

The way rush kills a frame is to kill the entire process group, starting at the perl script. So if the process tree is something like this:

    111 rushd
         \
         112 perl /path/to/renderscript
              \
              113 maya -batch

..then rush would invoke killpg(2) on PID 112 to kill perl and maya with a SIGKILL.

Under most versions of Unix I've seen, kill(1) (i.e. /bin/kill) can signal a process group by specifying a negative number for the PID. Oddly, the OSX man page for kill(1) makes no mention of process groups at all (!) Maybe this is another great man page omission. Probably the TCSH and BASH built-in versions of kill(1) support this, I'm not sure.

The easy thing to do would be to use 'rush -getoff; rush -online' to quickly kill any renders on the local box, regardless of platform.
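The process-group kill described above can be tried out in isolation. The sketch below is an illustration of the killpg(2) technique on a POSIX system, not rush's actual code: a small Python parent stands in for the perl render script, its sleeping child stands in for maya, and the helper names are hypothetical.

```python
import os
import signal
import subprocess
import sys

# Stand-in for "perl renderscript" launching "maya -batch": a parent that
# spawns one child, after which both simply sleep.
PARENT_CODE = (
    "import subprocess, sys, time\n"
    "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)'])\n"
    "time.sleep(60)\n"
)


def start_in_new_group(argv):
    """Start argv as the leader of its own session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(proc, sig=signal.SIGKILL):
    """Send sig to every process in proc's group, then reap proc."""
    os.killpg(os.getpgid(proc.pid), sig)
    return proc.wait()


def demo():
    """Launch the stand-in tree, kill the whole group, return exit status."""
    parent = start_in_new_group([sys.executable, "-c", PARENT_CODE])
    return kill_group(parent)


if __name__ == "__main__":
    # A negative status means the parent was terminated by that signal.
    print("parent exit status:", demo())
```

Because the child inherits the parent's process group, one `os.killpg()` call reaches both, which is why killing only the renderer leaves the script free to retry. `os.kill(-pgid, sig)` is the equivalent of the negative-PID form of kill mentioned above.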
From: Dylan Penhale <dylanpenhale@(email suppressed)>
Subject: RE: Shake INIT_Processeses problem
Date: Fri, 04 Aug 2006 20:53:19 -0400
Msg# 1368
You are right. I have no idea how I got started back on this old thread; Friday night was pretty busy :)

Sorry about this, I'll kill this thread and re-open another.