Brian J. Watson
2004-08-12 00:29:10 UTC
Hi Aneesh,
Good news!! I found the deadlock that was preventing the second node
from joining. Due to a locking change in flush_signals(), the icssvr
daemon was attempting to grab the same spinlock twice with interrupts
disabled. I removed the extraneous spinlock code from the CI wrapper
function, and a two-node CI cluster now works! I even did a couple of
CLMS master failovers.
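For anyone following along, here is a rough user-space sketch of the pattern, not the actual CI or kernel code. All names (demo_lock_t, siglock, demo_flush_signals, the two wrappers) are made up for illustration, and a try-lock stands in for the real spinlock so the double acquire is reported instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive kernel spinlock. */
typedef struct { atomic_flag held; } demo_lock_t;

static demo_lock_t siglock = { ATOMIC_FLAG_INIT };

/* Returns true if the lock was taken, false if it was already held. */
bool demo_trylock(demo_lock_t *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

void demo_unlock(demo_lock_t *l)
{
    atomic_flag_clear(&l->held);
}

/* Stand-in for a flush_signals() that now takes the lock itself. */
bool demo_flush_signals(void)
{
    if (!demo_trylock(&siglock))
        return false;   /* a real spinlock would spin forever here */
    /* ... pending signals would be cleared here ... */
    demo_unlock(&siglock);
    return true;
}

/* Old-style wrapper: takes the lock, then calls a function that takes
 * the same lock again. This is the self-deadlock. */
bool wrapper_with_extra_lock(void)
{
    if (!demo_trylock(&siglock))
        return false;
    bool ok = demo_flush_signals();   /* second acquire fails */
    demo_unlock(&siglock);
    return ok;
}

/* Fixed wrapper: leaves all locking to the callee. */
bool wrapper_fixed(void)
{
    return demo_flush_signals();
}

/* Small self-check of the two wrappers. */
void demo_check(void)
{
    assert(!wrapper_with_extra_lock());
    assert(wrapper_fixed());
}
```

In the kernel there is no try-lock escape hatch in this path: with interrupts disabled, the second acquire on the same CPU simply spins forever, which is consistent with the node hanging during the join.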
The modified patch is available at
http://ci-linux.sourceforge.net/contrib/ci-2.6.6.ak.aug11.patch.bz2
I still need to review the changes you made to get CI/2.6 to build. I'll
be on vacation until next Tuesday, so I'll review them after I get back.
Then I'll create a new branch in the CI repository for this stuff.
Best regards,
Brian