RUSH RENDER QUEUE
(C) Copyright 1995,2000 Greg Ercolano. All rights reserved.
V 102.30 04/25/01

Strikeout text indicates features not yet implemented

Systems Administrator Questions

I'm getting "connect(myhost): Invalid argument" in my linux machine's rushd.log?
What does 'rresvport(): Permission denied' mean?
What does 'bind(): Address already in use' mean?
What's the best way to verify all the daemons are running?
How do I stop/start the daemons? (Unix/NT)
Is there an example boot script I can use to invoke rush?
Is there a way to run 'rush -online' automatically when someone logs out?
Is there a way to run 'rush -online' automatically when someone's screensaver pops on?
What kinds of security issues are there with rush?
How do I update changes to the rush hosts file (or rush.conf file) to the network?
Is there a way to track whose jobs are bumping whom?
Is there a way to track who's changing other people's jobs?
Can rush be told to use a different network interface, other than the machine's hostname?
Windows: Where can I get perl for Windows?
Windows: Where can I get rsh/telnet daemons for Windows?
Windows: is there a way to restart the rushd service as a normal user?
Windows: Is there a way to disable error dialogs? (General Protection Faults, etc)
I changed the IP address of a host on my network, and now rush can't talk to it?
Sometimes I see "???.???.???.???" in 'rush -lah' reports. Is this bad?
Rush is acting slow; reports take a long time, and the GUI is sluggish. What's wrong?
Can I use DHCP on machines running rush?

I'm getting "connect(myhost): Invalid argument" in my linux machine's rushd.log?

This is a known problem with default installations of RedHat Linux (7.1, etc.), caused by the installer, which puts the machine's hostname on the loopback line of /etc/hosts.

To test for the problem, open a shell on that machine and ping its own hostname. If the address returned is 127.x.x.x, fix your /etc/hosts file.


     you@tahoe % ping tahoe
     PING tahoe (127.0.0.1) from 127.0.0.1 : 56(84) bytes of data.
     64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.5 ms
     64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0.4 ms
     ^C

     you@tahoe % grep tahoe /etc/hosts
     127.0.0.1 tahoe localhost


     root@tahoe # vi /etc/hosts
     root@tahoe # cat /etc/hosts
     127.0.0.1      localhost
     192.168.0.37   tahoe
     
     root@tahoe # ping tahoe
     PING tahoe (192.168.0.37) from 192.168.0.37 : 56(84) bytes of data.
     64 bytes from 192.168.0.37: icmp_seq=0 ttl=255 time=0.5 ms
     64 bytes from 192.168.0.37: icmp_seq=1 ttl=255 time=0.5 ms
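The same check can be made against the hosts file directly, without waiting on ping. This is a minimal sketch in POSIX shell; the function name loopback_names is made up for this example and is not part of rush.

```shell
# loopback_names FILE HOST   (hypothetical helper, not a rush command)
# Prints every line of FILE that maps HOST to a 127.x.x.x address.
# Exit status is 0 if at least one such line was found.
loopback_names() {
    awk -v host="$2" '
        /^[ \t]*#/ { next }                  # skip comment lines
        $1 ~ /^127\./ {                      # loopback address in column 1
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == host) { print; found = 1; break }
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, 'loopback_names /etc/hosts tahoe' would print the offending "127.0.0.1 tahoe localhost" line on the broken machine above, and print nothing once the file is fixed.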

What does 'rresvport(): Permission denied' mean?

	 % rush -ping
         tahoe: rush: rresvport(): Permission denied

Make sure the rush(1) binary is owned by root and has its SUID bit set. Change the owner first, since chown(1) clears the SUID bit on some systems (Linux among them):

	 chown 0.0  /usr/local/rush/bin/rush
	 chmod 4755 /usr/local/rush/bin/rush

What does 'bind(): Address already in use' mean?

It usually means one of these things:

1. You recently stopped and restarted the daemon. The problem goes away by itself.
2. Two or more rushd's are running.
3. Something else is using the port.
4. The kernel is using the port, probably NFS.

#1 is the most common, and occurs when you stop/start the rushd daemon. The OS often keeps a recently closed TCP listener unavailable to other processes for about 90 seconds. Rush keeps retrying to bind to the port, and normally succeeds within 2 minutes without any intervention.

#2 shows up when 'rush -ping' works, but bind() errors keep appearing in the logs. 'ps -elf | grep rushd' will show more than one rushd running; only one daemon should have a PPID (Parent Process ID) of 1. If there's more than one with a PPID of 1, kill the one(s) with the higher PID.
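The check for #2 can be sketched as a small filter. This assumes a ps(1) that supports the POSIX '-o' option; the function names are invented for the example.

```shell
# top_level_rushd: reads "PID PPID COMMAND" lines on stdin and prints the
# PIDs of rushd processes whose parent is PID 1, lowest PID first.
top_level_rushd() {
    awk '$2 == 1 && $3 ~ /(^|\/)rushd$/ { print $1 }' | sort -n
}

# extra_rushd: the same list minus its first (lowest) entry, i.e. the
# daemons the text above suggests killing.
extra_rushd() {
    top_level_rushd | sed 1d
}
```

Typical use would be 'ps -eo pid,ppid,comm | extra_rushd'. No output means at most one top-level daemon is running. Review the list before passing it to kill(1).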

#3 Stop the rush daemon, and use 'netstat -an' to see if some other program is using rush's port (normally port #696; see the serverport setting in your rush.conf file, in case your site uses a different port number). Look for open UDP or TCP connections on that port, in either the Local or Foreign address.

If you see port #696 in the 'Foreign' address of the local machine, suspect hung clients on the remotes:

rsh over to the remote machine (ie. 'Foreign' host)
Kill any 'rush' client processes you see, eg. 'killall rush'
Back on the local machine, do a 'netstat -an' to verify the connections are gone or closing.

Restart the daemon once all connections on port 696 have closed.

If the local TCP or UDP port is in use, suspect some system daemon or other is using the port when it shouldn't. Use fuser(1) or similar utility to figure out which process is using the port, or simply reboot.

If fuser(1) shows no process and it's an SGI, then see #4.

#4 often occurs if you've just installed rush on an SGI for the first time, and the machine has been up for a while. 'netstat -an' will show a whole slew of UDP listeners on ports between 512 and 1024 all in sequence, one of them being port number 696, the one rush has been assigned by IANA. Some rogue kernel utility is causing this, probably NFS. Usually fuser(1) shows no process associated with the rogue UDP listeners because it's a kernel process. The easiest solution is to simply reboot; when rush starts on boot, it always secures the port it needs well before the kernel gets a chance to step on it.
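For #3 and #4, picking the relevant lines out of a long 'netstat -an' listing can be sketched as below. The function name port_users is hypothetical. It accepts both the "address:port" form printed by Linux and the "address.port" form printed by Irix and BSD-style systems.

```shell
# port_users PORT: reads 'netstat -an' output on stdin and prints each line
# whose local or foreign address ends in PORT.
port_users() {
    awk -v port="$1" '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ ("[.:]" port "$")) { print; next }
        }
    '
}
```

For example, 'netstat -an | port_users 696' shows only the sockets involving rush's default port. Pass the serverport value from rush.conf instead if your site uses a different port.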

What's the best way to verify all the daemons are running?

rush -ping +any

This 'pings' all the daemons in the $RUSH_DIR/etc/hosts file with a TCP message.

If the daemon isn't running, tail(1) the daemon's log file in $RUSH_DIR/var/rushd.log.

How do I stop/start the daemons? (Unix/NT)

Irix	`/etc/init.d/rush stop`
	`/etc/init.d/rush start`
Linux/RedHat 6.x	`/etc/rc.d/init.d/rush stop`
	`/etc/rc.d/init.d/rush start`
Windows NT	`net stop rushd`
	`net start rushd`

All the daemons can be stopped via:

rush -dexit +any

Is there an example boot script I can use to invoke rush?

$RUSH_DIR/etc/S99rush

Is there a way to run 'rush -online' automatically when someone logs out?

Irix	`/usr/lib/X11/xdm/Xreset`
Linux/RedHat 6.x	`/etc/X11/xdm/TakeConsole`

A literal example of what should be added to these files would be:

/usr/local/rush/bin/rush -online
logger -t RUSH "Rush online (user logout)"

Use of logger(1) is optional; it leaves an audit trail in the syslog. Include the full path to logger(1) if security is an issue.

Is there a way to run 'rush -online' automatically when someone's screensaver pops on?

There probably is, but I don't know how to do it.

If you have any suggestions on how to do it on various platforms, please send me email.

What kinds of security issues are there with rush?

To avoid root loopholes, be sure all subdirs in the path to the setuid binaries and config files have tight permissions, e.g., if rush is installed in /usr/local/rush/bin:


    chmod go-w /usr \
	       /usr/local \
	       /usr/local/rush \
	       /usr/local/rush/bin \
	       /usr/local/rush/bin/* \
	       /usr/local/rush/etc \
	       /usr/local/rush/var \
	       /usr/local/rush/var/*

    # chown first: on some systems (eg. Linux) chown clears the SUID bit
    chown 0.0 /usr/local/rush/bin/rush \
	      /usr/local/rush/bin/rushd

    chmod 4755 /usr/local/rush/bin/rush
    chmod  755 /usr/local/rush/bin/rushd
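To audit an existing install, the directories leading to a binary can be walked from the bottom up. This is a rough sketch only; writable_parents is a name made up here, and it relies on the POSIX find(1) options -prune and -perm.

```shell
# writable_parents PATH: prints PATH and each of its parent directories,
# up to and including /, that is group-writable or world-writable.
# PATH must be absolute. Nothing printed means nothing to tighten.
writable_parents() {
    case $1 in
        /*) p=$1 ;;
        *)  echo "writable_parents: need an absolute path" >&2; return 2 ;;
    esac
    while :; do
        # -prune keeps find from descending; the -perm tests match a
        # group-write (020) or other-write (002) bit.
        find "$p" -prune \( -perm -020 -o -perm -002 \) -print
        if [ "$p" = "/" ]; then break; fi
        p=$(dirname "$p")
    done
}
```

For example, 'writable_parents /usr/local/rush/bin/rush' lists the components that still need a 'chmod go-w'. Symbolic links in the path are reported by the link's own mode, so check their targets separately.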

By default, rush uses reserved port 696 for its udp/tcp traffic. On secure networks, make sure users do not have root access: only root can bind a reserved port, so this keeps renegade software from exploiting the port.

Rush daemons will not run any job as a uid or gid less than 100. You can further restrict which uids/gids rush can run processes as via UidRange and GidRange or even ForceUid/ForceGid.
Rush daemons will only trust remote machines that are configured in its host list. Rush will log all connection attempts from machines not configured in the hosts file. Sysadmins can grep the rushd.log files for the string 'SECURITY' to detect security related problems.
The new 'rush -push' feature, which helps sysadmins distribute the rush hosts, rush.conf and license.dat files, can be disabled to close any suspected loopholes in such file distribution.

How do I update changes to the rush hosts file (or rush.conf file) to the network?

You should use rdist(1), and the changed files will be picked up automatically by the daemons within a minute. Here are some examples:


	# SEND A NEW rush.conf (csh syntax)
	foreach i ( `awk '/^[a-z]/{print $1}' /usr/local/rush/etc/hosts` )
	    rdist -c /usr/tmp/newconf ${i}:/usr/local/rush/etc/rush.conf
	end

	# SEND A NEW RUSH hosts
	foreach i ( `awk '/^[a-z]/{print $1}' /usr/tmp/newhosts` )
	    rdist -c /usr/tmp/newhosts ${i}:/usr/local/rush/etc/hosts
	end

NOTE: When sending out new files, you must use rdist(1), and not cp(1) or rcp(1). rdist(1) uses a special 'tmp-file/rename' technique that prevents the daemon from parsing the file before it has finished being written.
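The 'tmp-file/rename' idea can be illustrated locally. This is only a sketch of the general technique, not rdist's actual implementation; the function name atomic_install is invented for the example.

```shell
# atomic_install SRC DEST: copy SRC to a temporary name in DEST's own
# directory, then rename it over DEST. A rename within one filesystem is
# atomic, so a reader sees either the old file or the complete new one,
# never a half-written file.
atomic_install() {
    src=$1 dest=$2
    tmp="$dest.tmp.$$"
    cp "$src" "$tmp" && mv -f "$tmp" "$dest" || { rm -f "$tmp"; return 1; }
}
```

A plain cp(1), by contrast, truncates the destination and refills it in place, which is the window in which a daemon could parse a partial file.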

Is there a way to track whose jobs are bumping whom?

Grep the $RUSH_DIR/var/rushd.log file for BUMP messages.

Is there a way to track who's changing other people's jobs?

Grep the $RUSH_DIR/var/rushd.log file for SECURITY messages.
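For a quick overview of both kinds of messages, the two greps can be combined. This sketch assumes nothing about the log format beyond the BUMP and SECURITY keywords mentioned above; log_summary is a hypothetical helper name.

```shell
# log_summary FILE...: prints how many lines in the given log file(s)
# contain each keyword, one "KEYWORD COUNT" pair per line.
log_summary() {
    for word in BUMP SECURITY; do
        printf '%s %s\n' "$word" "$(cat "$@" | grep -c "$word")"
    done
}
```

For example, 'log_summary $RUSH_DIR/var/rushd.log'. To read the messages themselves, grep the file for the keyword as described above.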

Can rush be told to use a different network interface, other than the machine's hostname?

Yes. In the rush hostlist, the hostname can actually be a pair of hostnames separated by a ':', e.g., tahoe:tahoe-eth.

The name on the left of the ':' is the familiar hostname(1) of the machine, and the name that follows the ':' is the alternate network interface you want to use.

See also the Hosts File: Hostname section on the hostname field.

Windows: Where can I get perl for Windows?

It is highly recommended you use ActiveState Perl.

It's definitely the best: well integrated with, and well documented for, the Windows platform. It is also highly cross-platform compatible, with excellent Windows-specific modules and many of the standard internet modules (Mail, FTP, NNTP, etc.).

I've personally tested and used it extensively in various production environments and have found it to be the most stable perl available.

It's a free download.

Windows: Where can I get an rsh(1)/rcp(1) daemon for Windows?

Denicomp - has rsh(1) and rcp(1) daemons
Georgia SoftWorks - has an excellent telnet daemon

Regarding Denicomp and rsh: it lets you run simple commands on the remote machines using NT's own rsh(1) client. It supports 'rsh hostname command', but does not support 'rsh hostname'; in other words, you can't strike up an interactive session. You get a limited free trial of the software, then if you like it you should buy it.

Regarding Georgia SoftWorks' telnet server, I have to say it's impressive what it does. You can run interactive DOS applications that even do direct screen memory access, and the results will look correct on the telnet client. It is compatible with unix telnet clients as well as NT clients. Unfortunately, the software is very expensive, but you get a 30 day trial to test it out.

There is also freeware available. Most of what I've evaluated has extreme limitations, or is easily broken.

The NT Resource Kit from Microsoft also comes with a telnet server, though I've never tried it.

Windows: is there a way to restart the rushd service as a normal user?

I. If you're running under Win2K, you can use the new 'runas' command. Similar to su(1) in unix, it lets you run commands as administrator. The following gives you a DOS shell with network administrator privileges, regardless of who is currently logged in:

     runas /user:YOUR_DOMAIN\Administrator cmd
     Password:

Then, in the new shell:

     net stop rushd
     net start rushd

II. Use your domain controller's Remote Services administration software. With a Win2K server:

Start->Programs->Administrative Tools->Active Directory Users And Computers
Select 'Computers'
From the list of computers, right click on the one to control, and choose 'Manage'
Under the "Tree" tab, click "Services and Applications"
Choose "Services", then choose "Rushd" and then use the usual Start/Stop controls

There's surely something similar under WinNT Server.

Windows: Is there a way to disable error dialogs? (General Protection Faults, etc)

Yes. But it is a registry tweak that affects the entire machine.

See Microsoft Knowledge Base Article Q124873 for more info. To paraphrase what the article says, alongside its usual risk disclaimers about manually editing the registry:

Run Registry Editor (REGEDT32.EXE).
From the HKEY_LOCAL_MACHINE subtree, go to the following key:
\SYSTEM\CurrentControlSet\Control\Windows
Select the ErrorMode value.
From the Edit menu, choose DWORD.
Type 0 (zero), 1, or 2 to select the error mode. Regardless of this setting, all errors are written to the system log:
- 0 - Error message box pops up (default).
- 1 - No dialog for system errors only.
- 2 - No dialog for system or other errors.

I changed the IP address of a host on my network, and now rush can't talk to it?

For speed, the rush daemons cache hostname-to-ip-address lookups for all the hosts in the rush hostlist. This prevents load on your DNS, NIS and WINS servers, since rush makes numerous hostname/ip lookups when it's running jobs.

When you change the IP address of one of the machines, the rush daemons need to be told to flush their caches. touch(1)ing the $RUSH_DIR/etc/hosts file on every machine changes the file's date stamp, so each daemon thinks a change was made, reloads the file, and flushes its cache.

You can check what any daemon has in its IP cache using 'rush -lah <hostname>'. This shows the rush hostlist according to the daemon on the named machine, including its cached IP address lookup information.
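Touching the hosts file on every machine can be scripted. The sketch below assumes rsh(1) access to each host and that rush lives in /usr/local/rush everywhere; touch_hosts and the RSH override are inventions of this example, not rush features.

```shell
# touch_hosts HOSTSFILE: for each host named in column 1 of HOSTSFILE,
# touch that machine's copy of the rush hosts file.
# 'host:alt' pairs are reduced to the name left of the ':'.
# Set RSH=echo for a dry run that only prints the commands.
touch_hosts() {
    for h in $(awk '/^[a-z]/ { sub(/:.*/, "", $1); print $1 }' "$1"); do
        ${RSH:-rsh} "$h" touch /usr/local/rush/etc/hosts
    done
}
```

For example, 'touch_hosts /usr/local/rush/etc/hosts'. The daemons should then reload the file and flush their caches as described above.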

Sometimes I see "???.???.???.???" in 'rush -lah' reports. Is this bad?

Rush will not operate correctly if it can't do hostname lookups for any machine in the rush hosts file.

The question marks mean rush is unable to look up a host's name. The more of these there are, the slower rush will operate. You will also notice sluggish or very slow operation in the GUIs, and in the generation of most rush reports.

Other tools like 'ping' will probably be unable to look up the hostname either. Possible causes:

In an environment where static /etc/hosts files are used for name lookups, make sure the hostname is in all your /etc/hosts files.
In an NIS environment, same problem likely exists with your 'hosts' map.
Avoid DHCP on machines running rush, unless the leases never expire. Use static IPs.
In a DNS environment, either your DNS server is not responding, not configured, or the host is not in your DNS. Use nslookup(1) to debug the problem.
In a Windows environment where WINS is used to do hostname lookups, if the machine is down, WINS can't do a hostname lookup for it. To solve this problem, you can do any *one* of the following:
- Make *static* IP entries for all rush machines on your WINS server, so the hostnames still look up even if the machine is down. Use the 'WINS Manager' on your PDC.
- Maintain static hosts files on the rush machines. Windows has a unix-like hosts file, "C:\WINNT\SYSTEM32\DRIVERS\ETC\HOSTS". Set one up with IP-to-hostname entries for all the rush hosts, then copy it to all the machines to ensure hostname lookups never fail.
- Get away from WINS, and use DNS with static IP-to-hostname lookups. This will ensure hostname lookups work even when hosts are down.
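On the unix side, lookups can be tested independently of rush with a loop like the one below. It is a sketch only: check_lookups and the LOOKUP override are made up for this example, and the default resolver command 'getent hosts' exists on Linux and some other systems but not everywhere, so substitute whatever lookup tool your platform provides.

```shell
# check_lookups HOSTSFILE: prints every name from column 1 of HOSTSFILE
# (both halves of any 'host:alt' pair) that fails to resolve.
# LOOKUP may name an alternative command that takes a hostname and
# exits non-zero when the lookup fails.
check_lookups() {
    for h in $(awk '/^[a-z]/ { n = split($1, a, ":")
                               for (i = 1; i <= n; i++) print a[i] }' "$1"); do
        ${LOOKUP:-getent hosts} "$h" >/dev/null 2>&1 || echo "$h"
    done
}
```

For example, 'check_lookups $RUSH_DIR/etc/hosts'. Any name it prints is a candidate for the "???.???.???.???" entries in 'rush -lah' reports.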

Rush is acting slow; reports take a long time, and the GUI is sluggish. What's wrong?

Run 'rush -lah' and 'rush -lah localhost' to see if it reports "???.???.???.???" for the ip address of any hostnames. If so, you are having a name lookup problem, and that's causing the problem.

This is especially a problem on Windows networks if you use WINS instead of DNS. WINS can't do hostname lookups for a machine that is down, which is a good reason NOT to be lazy and depend on WINS to dynamically keep track of things.

To solve this problem, see the previous question about "???.???.???.???" in 'rush -lah' reports.

Can I use DHCP on machines running rush?

You can, but only if the leases are set to never expire.

Rush will not operate correctly if the IP addresses of machines change randomly, or change when they reboot. The best thing to do is assign static IP addresses to all machines running rush.