David B. Zafman
2002-06-10 16:00:13 UTC
Below is something I wrote up last week, but was waiting for Bruce to
comment on it before sending it out. Now that I see what you've done
with /etc/init.d/clusterinit, I thought I'd send this out. I will
examine what you've done today.
Last week I did something similar. I wasn't as concerned with the
dependent node networking, but I wanted to replace rc.sysinit for
dependent nodes only. I copied the redhat rc.sysinit to
rc.sysinit.nodeup and removed all the things which the dependent nodes
should not be duplicating. I also removed the execution of rc for run
level 3 from rc.nodeup. Keep in mind that only the first booting node
runs rc.sysinit just like base linux. Since only dependent nodes run
rc.nodeup, only the dependent nodes run rc.sysinit.nodeup.
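A minimal sketch of that boot-path split (the rc.sysinit.nodeup name is from above, but the CLUSTERNODE_NUM_CMD helper and the exact branching shown here are illustrative assumptions, not the actual project code):

```shell
#!/bin/sh
# Hypothetical sketch: dependent nodes run a trimmed rc.sysinit.nodeup
# instead of the full rc.sysinit plus runlevel-3 processing.
# CLUSTERNODE_NUM_CMD is an assumed stand-in for a helper like
# /usr/sbin/clusternode_num that prints this node's number.
: "${CLUSTERNODE_NUM_CMD:=/usr/sbin/clusternode_num}"

run_nodeup_sysinit() {
    node=$($CLUSTERNODE_NUM_CMD)
    if [ "$node" != "1" ]; then
        # Dependent node: do only the trimmed init work (no fsck of
        # the shared root, no one-time cleanups, no rc 3 processing).
        echo "node $node: running trimmed rc.sysinit.nodeup"
    else
        # First booting node: stock rc.sysinit already ran via init,
        # just like base Linux.
        echo "node $node: stock rc.sysinit already ran"
    fi
}
```

The point of the split is that the first booting node keeps the stock boot sequence, while joining nodes never repeat work that is per-filesystem rather than per-node.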
---------
You've brought up an important architectural issue. Once there is a
single root, running duplicate services requires clusterization. One
way to clusterize things is to add context-dependent links
(e.g. /var/run, as you proposed for the *.pid files).
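As a sketch of the context-dependent link idea, where each node resolves /var/run to its own private directory (the run.nodeN naming and the sandbox $root parameter are illustrative assumptions, not the project's actual mechanism):

```shell
#!/bin/sh
# Illustrative context-dependent link: /var/run becomes a symlink
# whose target differs per node. $root stands in for / so the sketch
# can be exercised in a sandbox; run.nodeN is an assumed naming scheme.
make_cdsl() {
    root=$1
    node=$2
    mkdir -p "$root/var/run.node${node}"
    # Relative symlink: /var/run -> run.nodeN, resolved per node.
    ln -sfn "run.node${node}" "$root/var/run"
}
```

On a live system the link target would be chosen at boot (e.g. from the node number), so each joining node gets its own pid-file namespace without having to clusterize every individual service.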
The current set-up, where rc.nodeup calls rc.sysinit and then runs
complete runlevel-3 processing, was fine when we had a non-shared
root. Now with CFS and GFS we really need to NOT do this. Looking at
rc.sysinit on a redhat install, I see that it does all sorts of stuff
which should NOT be done again by a joining node in the shared-root case.
In a cluster there would generally be two kinds of services. The first
kind is a single instance of the service (single process or set of
processes on one node) running with keepalive to restart it on node
failures. The second kind is the service that is cluster aware, so that
processes could exist on multiple nodes, but they cooperate with each
other. In non-stop clusters we parallelized inetd, for example. It
maintained processes on all nodes, and kept a list of pids which it
updated as nodes came and went.
The whole /var/run/service_name.pid mechanism I would propose is only
used for non-cluster-aware services which are restricted to running on
the root node, but may be restarted on node failure. It is assumed that
to restart the service we might have to remove the .pid file, and on
(re)start the service would create the file again with the new pid.
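A sketch of that pid-file handling, with a hypothetical restart helper (the function name and the way the service command is passed are illustrative, not an existing keepalive interface):

```shell
#!/bin/sh
# Illustrative keepalive step for a non-cluster-aware service:
# remove a stale /var/run/service_name.pid left by a dead instance
# (e.g. after a node failure), then restart and record the new pid.
restart_service() {
    pidfile=$1
    shift
    if [ -f "$pidfile" ]; then
        oldpid=$(cat "$pidfile")
        # kill -0 only probes the pid; if no such process exists,
        # the pid file is stale and can be removed.
        if ! kill -0 "$oldpid" 2>/dev/null; then
            rm -f "$pidfile"
        fi
    fi
    # Start a fresh instance only when no live one holds the pidfile.
    if [ ! -f "$pidfile" ]; then
        "$@" &
        echo $! > "$pidfile"
    fi
}
```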
Hi,
I guess we need to have a node-specific /var/run directory also.
Otherwise on debian some services may not come up on node 2. They check
the /var/run/service_name.pid file to see whether the service is already
running or not.
To make that happen on debian, add these lines to /etc/init.d/rcS
before the for loop shown below:
#
# Cluster specific remounts.
#
mount --bind /etc/network-`/usr/sbin/clusternode_num` /etc/network
mount --bind /run-`/usr/sbin/clusternode_num` /var/run
#
# Call all parts in order.
#
for i in /etc/rcS.d/S??*
-aneesh
_______________________________________________________________
Don't miss the 2002 Sprint PCS Application Developer's Conference
August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm
_______________________________________________
ssic-linux-devel mailing list
https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel
--
David B. Zafman | Hewlett-Packard Company
Linux Kernel Developer | Open SSI Clustering Project
mailto:***@hp.com | http://www.hp.com
"Thus spake the master programmer: When you have learned to snatch
the error code from the trap frame, it will be time for you to leave."