From: Jon Herman <jonh@(email suppressed)>
Subject: How do I force a machine with multiple CPUs to use only one CPU for
Date: Mon, 08 May 2006 10:51:44 -0400
Msg# 1284
I would like to be able to create a rush group that would limit the
number of CPUs to be used in a specific job.
Here's my problem: I need to be able to use all of the CPUs on a host for rendering with XSI, but I also need to be able to use those same multi-cpu hosts with an application that works using only one CPU.

So, I'd like to retain the multiple processors for XSI jobs, but use a single processor for another type of job, on the same set of hosts. I know I can describe the host's CPUs in the rush.conf file, but is there a way to make rush think the host only has one processor when it's using a specific group?

Best,
Jon Herman
Troublemaker Studios
From: Greg Ercolano <erco@(email suppressed)>
Subject: Re: How do I force a machine with multiple CPUs to use only one CPU
Date: Mon, 08 May 2006 17:23:26 -0400
Msg# 1286
Jon Herman wrote:
> [posted to rush.general]
> I would like to be able to create a rush group that would limit the
> number of CPUs to be used in a specific job.
>
> Here's my problem: I need to be able to use all of the CPUs on a host
> for rendering with XSI, but I also need to be able to use those same
> multi-cpu hosts with an application that works using only one CPU.
>
> So, I'd like to retain the multiple processors for XSI jobs, but use a
> single processor for another type of job, on the same set of hosts.
> I know I can describe the host's CPUs in the rush.conf file, but is
> there a way to make rush think the host only has one processor when
> it's using a specific group?

Hi Jon,

It sounds like you want a single XSI frame to take up the whole machine when it runs, blocking other jobs from using the other cpus, so that rush only runs one instance of XSI per machine. And when single-threaded non-XSI frames are running, they take up just one cpu each, so that several frames can run on a machine, one per cpu.

There are a few techniques defined here:
http://www.seriss.com/rush-current/rush/rush-techniques.html#Threading

None of these approaches are 'pretty'. The issue is that for rush to do it properly, when a cpu becomes available, rush would have to hold it available until the OTHER cpu also freed up, so that both cpus would be free when the XSI job runs. Currently rush doesn't hold a cpu free to wait for other cpu(s) to free up to run a job, and I see no way to do it properly without that.

The 'using ram to reserve cpus' approach (#1 in the above link) comes close, but the job will only take a cpu if both cpus are available. You can submit the XSI job with a higher priority than the others, using the 'k' flag, to ensure it first bumps other jobs out of the way so that it can secure both processors.

For instance, say all the machines on your farm are configured with 4096 of ram in rush (ie. the 'RAM' fields in the rush/etc/hosts file are all set to 4096). Then submitting an XSI job with:

    # SUBMIT XSI JOB
    rush -submit << EOF
    :
    ram   4096
    cpus  +any=5@10k
    :
    EOF

..will cause the job to request all the ram on each machine, asking for 5 cpus at 10k priority. This way, if two processors on a machine are each rendering single-threaded Maya jobs at a lower priority, the above XSI job will bump those two Maya jobs out of the way, because:

    - the 10k (the k=kill) ensures other jobs are cleared off: this job
      will kill lower priority jobs to free up enough ram to run

    - the "ram 4096" guarantees all ram will be reserved for this job's
      frame, preventing other jobs from jumping in, and also preventing
      this job from using more than one cpu on each machine

For instance, here's a Maya job using both processors of all machines on a small network of 4 machines, each with dual procs, running at a priority of 5:

    [erco@ontario] : rush -lac
    HOST     OWNER  JOBID       TITLE     FRAME  PRI  PID    ELAPSED   REMARKS
    ontario  erco   ontario.56  MAYA_JOB  0007   5    7392   00:05:02
    ontario  erco   ontario.56  MAYA_JOB  0008   5    7394   00:05:02
    rotwang  erco   ontario.56  MAYA_JOB  0001   5    29699  00:05:03
    rotwang  erco   ontario.56  MAYA_JOB  0004   5    29701  00:05:03
    meade    erco   ontario.56  MAYA_JOB  0002   5    32204  00:05:03
    meade    erco   ontario.56  MAYA_JOB  0003   5    32206  00:05:03
    tower    erco   ontario.56  MAYA_JOB  0005   5    5062   00:05:03
    tower    erco   ontario.56  MAYA_JOB  0006   5    5063   00:05:03

Now I submit an XSI job asking for all the ram on each machine (4096) and for +any=3@10k:

    # SUBMIT XSI JOB
    rush -submit << EOF
    :
    ram   4096
    cpus  +any=3@10k
    :
    EOF

As soon as the job is submitted, 3 of the 4 machines will get their Maya frames bumped (and requeued), putting the XSI job in their place, one XSI frame per machine, leaving the other cpu on each machine unavailable:

    [erco@ontario] : rush -lac
    HOST     OWNER  JOBID       TITLE     FRAME  PRI  PID    ELAPSED   REMARKS
    ontario  erco   ontario.58  XSI       0002   10k  7461   00:00:09
    ontario  -      -           -         -      -    -                Online
    rotwang  erco   ontario.58  XSI       0001   10k  29712  00:00:10
    rotwang  -      -           -         -      -    -                Online
    meade    erco   ontario.58  XSI       0003   10k  32214  00:00:09
    meade    -      -           -         -      -    -                Online
    tower    erco   ontario.56  MAYA_JOB  0005   5    5062   00:14:36
    tower    erco   ontario.56  MAYA_JOB  0006   5    5063   00:14:36

When you look at the ram available on the machines running XSI, you'll see the XSI job is taking all the ram, leaving none for other jobs, which prevents other jobs from sneaking in:

    [erco@ontario] : rush -ramlist rotwang
    STATE  JOBID/TITLE     PRI  RAMUSE  NOTES
    Busy   ontario.58,XSI  10k  4096    <-- asking for all the ram
                                ------
                                4096
    Total ram on rotwang:     4096
    Available ram on rotwang: 0         <-- no ram available for other jobs to use the other cpu

Note how only "tower" has two Maya frames running; the other 3 machines are taken over by the XSI job, with only one cpu busy each.

This is not a perfect solution, but it does get you what you want.

Or you can use the 'reserve' approach (#3 in the above link), where you might make a +xsi group, reserve the extra processors on each machine with a 'sleep' job, and submit the XSI frames to just that +xsi group.

--
Greg Ercolano, erco@(email suppressed)
Rush Render Queue, http://seriss.com/rush/
Tel: (Tel# suppressed)
Cel: (Tel# suppressed)
Fax: (Tel# suppressed)
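
The arithmetic behind the 'using ram to reserve cpus' trick can be sketched in a few lines of shell. This is only a toy model, not rush's actual scheduler: it assumes a host keeps starting frames while it has both a free cpu and enough unreserved ram for the job's 'ram' request. The function name frames_per_host and the 1024 ram figure for the Maya job are made up for illustration; the 4096 ram and dual-cpu hosts come from the example above.

```shell
#!/bin/sh
# Toy model (an assumption, not rush's real scheduling code):
# the number of frames of one job that fit on a host is limited by
# both the host's cpu count and how many 'ram' requests fit in its ram.
#
# usage: frames_per_host <cpus> <host_ram> <job_ram>
frames_per_host() {
    cpus=$1
    host_ram=$2
    job_ram=$3

    # A job that asks for no ram is limited only by the cpu count.
    if [ "$job_ram" -le 0 ]; then
        echo "$cpus"
        return
    fi

    # How many frames fit by ram alone (integer division).
    by_ram=$(( host_ram / job_ram ))

    # The smaller of the two limits wins.
    if [ "$by_ram" -lt "$cpus" ]; then
        echo "$by_ram"
    else
        echo "$cpus"
    fi
}

# XSI job: 'ram 4096' on a dual-cpu host configured with 4096 of ram.
echo "XSI frames per host:  $(frames_per_host 2 4096 4096)"

# Hypothetical single-threaded Maya job asking for 1024 of ram.
echo "Maya frames per host: $(frames_per_host 2 4096 1024)"
```

Under this model the XSI job gets one frame per host because its ram request uses up the whole machine, while the smaller job still gets one frame per cpu, which matches the 'rush -lac' and 'rush -ramlist' reports shown in the reply.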