[openfabrics-ewg] IPoIB HA not working on RHEL4 U3
John Blackwood
john.blackwood at ccur.com
Mon Sep 25 11:17:29 PDT 2006
Hi Michael,
Thank you for the info/update.
While we're on the topic of IPoIBHA, I did a bit of testing on
RHEL4 U2 and SLES 10 systems, and I noticed the following two issues.
Maybe you have noticed these same issues too?
1. If the port goes down, then ipoib_ha.pl will start up mcasthandle.
If you then do a 'service openibd restart' then the mcasthandle
process is not killed off. Thus, you can end up with two copies of
mcasthandle. It would be nice if the 'stop' portion of openibd
could kill off mcasthandle if it is running.
2. In OFED-1.1-rc5, the ipoib_ha.pl script would set the down port's
IP address to zero. In OFED-1.1-rc6, the script instead marks
the interface 'down'.
I've noticed that in rc5, when the primary port went down, the
secondary IP address became unusable.
In rc6, when the primary port goes down, the secondary IP address
is still usable, but now I notice that every 8 seconds or so,
traffic across either IP address will pause for a bit.
If the port comes back up, and I restart openibd, then this
pausing behavior goes away.
I was just wondering if you are also seeing this behavior?
Thanks for listening.
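To make issue 1 concrete, here is a rough sketch (in Python, not taken from openibd itself) of what a 'stop' step could do to clean up leftover mcasthandle processes. It assumes that mcasthandle shows up by name somewhere on the command line reported by 'ps -eo pid,args'. The function names find_pids and kill_stale are made up for illustration. Note that any process with a bare "mcasthandle" argument (for example a grep for it) would also match, so a real hook would want a stricter test.

```python
import os
import signal
import subprocess


def find_pids(ps_output, name="mcasthandle"):
    """Return PIDs whose command line has an argument with basename `name`.

    `ps_output` is expected to look like the output of `ps -eo pid,args`.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that does not start with a PID.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        if any(os.path.basename(arg) == name for arg in fields[1:]):
            pids.append(int(fields[0]))
    return pids


def kill_stale(name="mcasthandle"):
    """Send SIGTERM to every matching process (hypothetical 'stop' hook)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_pids(out, name):
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # process exited between the listing and the kill
```

Calling kill_stale() before the restart would avoid ending up with a second copy; whether SIGTERM is the right signal for mcasthandle is also an assumption.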
John Blackwood
john.blackwood at ccur.com
Concurrent Computer Corp.
Michael S. Tsirkin wrote:
> Quoting r. John Blackwood <john.blackwood at ccur.com>:
>
>>Subject: Re: IPoIB HA not working on RHEL4 U3
>>
>>Just FYI,
>>
>>I was also having problems with IPoIBHA on RHEL4 U2.
>>
>>I found that the iproute-2.6.9-3 rpm did not output the NO-CARRIER
>>state, but when I installed iproute-2.6.11-1, the NO-CARRIER state did
>>appear, and IPoIBHA started working much better.
>
>
> Correct. What we are doing is packaging a version of iproute into ofed
> for use solely by ipoibha. This will then be used whenever
> iproute is older than 2.6.9.
>
>
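For anyone who wants to check whether their installed iproute reports the NO-CARRIER state, the following is a minimal sketch in Python. It only parses text in the form that 'ip link show' prints (an index, the interface name, then a comma-separated flag list in angle brackets). The helper names link_flags and has_no_carrier are hypothetical and are not part of ipoib_ha.pl.

```python
import re


def link_flags(ip_link_output, ifname):
    """Return the set of flags listed for `ifname`, or None if it is absent.

    `ip_link_output` is text in the format printed by `ip link show`.
    """
    pattern = re.compile(
        r"^\d+:\s+" + re.escape(ifname) + r"(?:@\S+)?:\s+<([^>]*)>",
        re.MULTILINE,
    )
    match = pattern.search(ip_link_output)
    if match is None:
        return None
    return {flag for flag in match.group(1).split(",") if flag}


def has_no_carrier(ip_link_output, ifname):
    """True only if the interface is listed and its flags include NO-CARRIER."""
    flags = link_flags(ip_link_output, ifname)
    return flags is not None and "NO-CARRIER" in flags
```

An iproute that never prints the flag makes has_no_carrier return False even with the cable pulled, which would be consistent with the problem described for iproute-2.6.9-3.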