This is a known problem with default installations of RedHat Linux (7.1, etc.), caused by the installer.
To test for the problem, open a shell on that machine and ping its own hostname. If the address returned is 127.x.x.x, fix your /etc/hosts file.
    you@tahoe % ping tahoe
    PING tahoe (127.0.0.1) from 127.0.0.1 : 56(84) bytes of data.
    64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.5 ms
    64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0.4 ms
    ^C
    you@tahoe % grep tahoe /etc/hosts
    127.0.0.1 tahoe localhost
...this is wrong. Correct it by editing /etc/hosts and making separate entries for 'localhost' and the machine's actual hostname and IP address, e.g.:
    root@tahoe # vi /etc/hosts
    root@tahoe # cat /etc/hosts
    127.0.0.1 localhost
    192.168.0.37 tahoe
    root@tahoe # ping tahoe
    PING tahoe (192.168.0.37) from 192.168.0.37 : 56(84) bytes of data.
    64 bytes from 192.168.0.37: icmp_seq=0 ttl=255 time=0.5 ms
    64 bytes from 192.168.0.37: icmp_seq=1 ttl=255 time=0.5 ms
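If you'd rather check the hosts file directly than read ping output, something like the following sketch can flag the bad entry. The function name `hostname_on_loopback` and its arguments are made up for illustration; it only looks at the hosts file, so it won't notice problems coming from DNS or NIS.

```shell
# Sketch only: succeed (exit 0) if a hostname appears on a loopback
# (127.x.x.x) line of a hosts file. Name and arguments are hypothetical.
hostname_on_loopback() {
    # $1 = hostname to look for, $2 = hosts file (defaults to /etc/hosts)
    awk -v host="$1" '
        /^[ \t]*#/ { next }                 # skip comment lines
        $1 ~ /^127\./ {                     # first field is a loopback address
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # stop at a trailing comment
                if ($i == host) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}

# Usage:
#   if hostname_on_loopback tahoe; then
#       echo "tahoe is mapped to a 127.x.x.x address; fix /etc/hosts"
#   fi
```

Note that 'localhost' itself is expected to match; the check is only meaningful for the machine's real hostname.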
    % rush -ping tahoe: rush: rresvport(): Permission denied
Rush uses a reserved port to communicate with the daemon, and therefore needs to run SUID root.
Make sure the SUID bit is on for the rush(1) binary, and the owner is root:
    chmod 4755 /usr/local/rush/bin/rush
    chown 0.0 /usr/local/rush/bin/rush
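To verify the result afterwards, a small check along these lines may help. This is a sketch, not part of rush; the function name `suid_root_ok` is hypothetical, and it assumes your ls(1) supports the -n flag for numeric owner ids.

```shell
# Sketch only: report whether a binary has the SUID bit set and is owned
# by uid 0. The function name is hypothetical.
suid_root_ok() {
    # $1 = path to the binary, e.g. /usr/local/rush/bin/rush
    [ -u "$1" ] || { echo "$1: SUID bit is not set"; return 1; }
    # 'ls -ldn' prints the numeric owner uid in the third field
    owner=$(ls -ldn "$1" | awk '{ print $3 }')
    [ "$owner" = "0" ] || { echo "$1: owner uid is $owner, not 0 (root)"; return 1; }
    return 0
}

# Usage:
#   suid_root_ok /usr/local/rush/bin/rush && echo "permissions look right"
```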
#1 is the most common, and occurs when you stop/start the rushd daemon. This problem fixes itself within 2 minutes automatically. The OS often keeps recently closed TCP listeners unavailable to other processes for a 90 second period. Rush will keep retrying to bind to the port, and eventually succeeds within 2 minutes.
#2 shows where 'rush -ping' works, but bind() errors keep appearing in the logs. 'ps -elf | grep rushd' will show more than one rushd running; only one daemon should have a PPID of 1 (Parent Process ID). If there's more than one with a PPID of 1, kill the one(s) with the higher PID.
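The check described in #2 can be roughed out as a filter. This is only a sketch: the function name `extra_rushd_pids` is made up, and it assumes a ps(1) that accepts the POSIX '-o' option, which older systems may not. It reads "PID PPID COMMAND" lines and prints the PIDs of every rushd with a PPID of 1 except the lowest one. It kills nothing; look at the list before acting on it.

```shell
# Sketch only: list rushd processes whose parent is PID 1, minus the one
# with the lowest PID. Input is "PID PPID COMMAND" lines on stdin.
# The function name is hypothetical.
extra_rushd_pids() {
    awk '$2 == 1 && $3 ~ /(^|\/)rushd$/ { print $1 }' |
        sort -n |       # lowest PID first
        sed '1d'        # drop the lowest; what remains are the extras
}

# Usage:
#   ps -e -o pid= -o ppid= -o comm= | extra_rushd_pids
```

If this prints nothing, only one (or no) top-level rushd is running.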
#3 Stop the rush daemon, and use 'netstat -an' to see if some other program is using rush's port (normally port #696; see your rush.conf file's serverport setting, in case your site uses a different port number). Look for open UDP or TCP connections on that port, either in the Local or Foreign address.
If you see port #696 in the 'Foreign' address of the local machine, suspect hung clients on the remotes:
If the local TCP or UDP port is in use, suspect some system daemon or other is using the port when it shouldn't. Use fuser(1) or similar utility to figure out which process is using the port, or simply reboot.
If fuser(1) shows no process and it's an SGI, then see #4.
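Picking the relevant lines out of a long 'netstat -an' listing by eye is tedious, so here is a sketch of a filter for #3. The function name `lines_using_port` is hypothetical. Since netstat output differs between platforms, it simply prints any line where some field ends in ':PORT' or '.PORT', which covers both the Local and Foreign columns.

```shell
# Sketch only: print netstat lines that mention a given port in any
# address field. Reads 'netstat -an' output on stdin.
# The function name is hypothetical.
lines_using_port() {
    # $1 = port number (defaults to 696, rush's usual port)
    awk -v port="${1:-696}" '
        {
            for (i = 1; i <= NF; i++) {
                n = length($i) - length(port)
                # field must end in the port, preceded by "." or ":"
                if (n > 0 && substr($i, n + 1) == port && substr($i, n, 1) ~ /[.:]/) {
                    print
                    next
                }
            }
        }
    '
}

# Usage:
#   netstat -an | lines_using_port 696
```

Matching on the separator character keeps a port like 1696 from being mistaken for 696.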
#4 often occurs if you've just installed rush on an SGI for the first time, and the machine has been up for a while. 'netstat -an' will show a whole slew of UDP listeners on ports between 512 and 1024 all in sequence, one of them being port number 696, the one rush has been assigned by IANA. Some rogue kernel utility is causing this, probably NFS. Usually fuser(1) shows no process associated with the rogue UDP listeners because it's a kernel process. The easiest solution is to simply reboot; when rush starts on boot, it always secures the port it needs well before the kernel gets a chance to step on it.
This 'pings' all the daemons in the $RUSH_DIR/etc/hosts file with a TCP message.
If the daemon isn't running, tail(1) the daemon's log file in $RUSH_DIR/var/rushd.log.
Platform         | Stop                       | Start
Irix             | /etc/init.d/rush stop      | /etc/init.d/rush start
Linux/RedHat 6.x | /etc/rc.d/init.d/rush stop | /etc/rc.d/init.d/rush start
Windows NT       | net stop rushd             | net start rushd
All the daemons can be stopped via:
Irix             | /usr/lib/X11/xdm/Xreset
Linux/RedHat 6.x | /etc/X11/xdm/TakeConsole
A literal example of what should be added to these files would be:
Use of logger(1) is optional; it leaves an audit trail in the syslog. Include full path to logger(1) if security is an issue.
If you have any suggestions on how to do it on various platforms, please send me email.
    chmod go-w /usr \
               /usr/local \
               /usr/local/rush \
               /usr/local/rush/bin \
               /usr/local/rush/bin/* \
               /usr/local/rush/etc \
               /usr/local/rush/var \
               /usr/local/rush/var/*
    chmod 4755 /usr/local/rush/bin/rush
    chmod 755 /usr/local/rush/bin/rushd
    chown 0.0 /usr/local/rush/bin/rush \
              /usr/local/rush/bin/rushd
NOTE:
When sending out new files, you must use rdist(1),
and not cp(1) or rcp(1). rdist(1) uses a special 'tmp-file/rename'
technique that prevents the daemon from parsing the file before
it has finished being written.
The name on the left of the ':' is the familiar hostname(1) of the machine,
and the name that follows the ':' is the alternate network interface you want
to use.
See also the Hosts File: Hostname section
on the hostname field.
It's definitely the best: well integrated with, and documented specifically
for, the Windows platform. It is highly cross-platform compatible, with excellent
Windows-specific modules and many of the standard internet modules,
including Mail/FTP/NNTP, etc.
I've personally tested and used it extensively in various production
environments and have found it to be the most stable perl available.
It's a free download.
Regarding Denicomp and rsh, it lets you run simple commands on the remote machines
using NT's own rsh(1) client. It supports 'rsh hostname command', but does not
support 'rsh hostname'. In other words, you can't strike up an interactive session.
You get a limited trial to use the software for free, then if you like it you
should buy it.
Regarding Georgia Softwork's telnet server, I have to say it's impressive
what it does. You can run interactive dos applications that even do direct
screen memory access, and the results will look correct on the telnet client.
It is compatible with unix telnet clients as well as NT clients. Unfortunately, the
software is very expensive. But you get a 30 day trial to test it out.
There is also freeware available. Most of those I've evaluated have extreme
limitations, or are easily broken.
The NT Resource Kit from Microsoft also comes with a telnet server,
though I've never tried it.
How do I update changes to the rush hosts file (or rush.conf file)
to the network?
You should use rdist(1), and the changed files will be picked up
automatically by the daemons within a minute. Here are some examples:
# SEND A NEW rush.conf
foreach i ( `awk '/^[a-z]/{print $1}' /usr/local/rush/etc/hosts` )
rdist -c /usr/tmp/newconf ${i}:/usr/local/rush/etc/rush.conf
end
# SEND A NEW RUSH hosts
foreach i ( `awk '/^[a-z]/{print $1}' /usr/tmp/newhosts` )
rdist -c /usr/tmp/newhosts ${i}:/usr/local/rush/etc/hosts
end
Is there a way to track whose jobs are bumping whom?
Grep the $RUSH_DIR/var/rushd.log file for BUMP messages.
Is there a way to track who's changing other people's jobs?
Grep the $RUSH_DIR/var/rushd.log file for SECURITY messages.
Can rush be told to use a different network interface, other than the machine's hostname?
Yes. In the rush hostlist, the hostname can actually be a pair of hostnames
separated by a ':', e.g., tahoe:tahoe-eth.
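If you have scripts of your own that read the rush hostlist, they need to cope with both forms of the hostname field. Here is a sketch of splitting the pair in plain shell. The function name `split_host_entry` is hypothetical, and treating a bare hostname as its own interface name is my assumption for illustration, not something rush itself guarantees.

```shell
# Sketch only: split a hostlist hostname field such as "tahoe:tahoe-eth"
# into "hostname interface". The function name is hypothetical.
split_host_entry() {
    entry=$1
    case "$entry" in
        *:*) host=${entry%%:*}      # text before the first ':'
             iface=${entry#*:} ;;   # text after the first ':'
        *)   host=$entry            # no ':' present; assume the interface
             iface=$entry ;;        # name is the hostname itself
    esac
    echo "$host $iface"
}

# Usage:
#   split_host_entry tahoe:tahoe-eth
```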
Where should I get perl for Windows?
It is highly recommended you use
ActiveState Perl.
Where can I get an rsh(1)/rcp(1) daemon for Windows?
I have personally evaluated both products, and found them both useful.
I. If you're running under Win2K, you can use the new 'runas' Win2K command. It is similar to su(1) in unix; it lets you run commands as administrator. The following gives you a DOS shell with network administrator privileges, regardless of who the currently logged in user is:
II. Use your domain controller's Remote Services administration software. With a Win2K server:
There's surely something similar under WinNT Server.
See Microsoft Knowledge Base Article Q124873 for more info. To paraphrase, along with the usual risk disclaimers regarding manual editing of the registry, this article basically says:
For speed, the rush daemons cache hostname-to-ip-address lookups for all the hosts in the rush hostlist. This prevents load on your DNS, NIS and WINS servers, since rush makes numerous hostname/ip lookups when it's running jobs.
When you change the IP address of one of the machines, the rush daemons need to
be told to flush their caches. touch(1)ing all the
You can check to see what any daemon has in its IP cache using 'rush -lah <hostname>'.
This will show you the rush hostlist according to the daemon on the named machine,
including its cached IP address lookup information.
Rush will not operate correctly if it can't do hostname lookups for any machine in the rush hosts file.
The question marks mean rush is unable to lookup a host's name. The more of these there are, the slower rush will operate. You will also notice sluggish or very slow operation in the GUIs, and in the generation of most rush reports.
Probably other tools like 'ping' will be unable to lookup the hostname. Possible causes:
This is especially a problem on Windows networks if you use WINS instead of DNS. WINS can't do hostname lookups for a machine that is down. A good reason NOT to be lazy, and depend on WINS to dynamically keep track of things.
To solve this problem, see the above.
You can do it, only if the leases are set to never expire.
Rush will not operate correctly if the IP addresses of machines change randomly, or change when they reboot. The best thing to do is assign static IP addresses to all machines running rush.