[Users] ibacm?

Hal Rosenstock hal.rosenstock at gmail.com
Wed Oct 25 08:11:35 PDT 2017


I modified the patch so that the log message says whether it was the
IPv6 or the IPv4 ioctl that failed:

        ret = ioctl(s, SIOCGIFCONF, ifc);
        if (ret < 0) {
                acm_log(0, "ioctl IPv%s ifconf error: %s\n",
                        (family == AF_INET6) ? "6" : "4", strerror(errno));
                goto out2;
        }

On Wed, Oct 25, 2017 at 10:55 AM, Michael Di Domenico
<mdidomenico4 at gmail.com> wrote:

> while i totally agree with both you and Hal that this is probably a
> rare issue, my contention really stems from the fact that the error
> message provides no clue that it's an ipv6 error
>
> On Wed, Oct 25, 2017 at 10:46 AM, Weiny, Ira <ira.weiny at intel.com> wrote:
> > Agreed.  I don’t know of many people who completely disable IPv6.  So this
> > should be rare.  And if they do then they should know that they will get
> > AF_INET6 errors on any software which is trying to support both…
> >
> >
> >
> > From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com]
> > Sent: Wednesday, October 25, 2017 5:56 AM
> > To: Michael Di Domenico <mdidomenico4 at gmail.com>
> > Cc: Weiny, Ira <ira.weiny at intel.com>; users at lists.openfabrics.org
> > Subject: Re: [Users] ibacm?
> >
> >
> >
> > Right now, I don't see an obvious way to eliminate that message for
> > AF_INET6, as sometimes it's appropriate and other times not.
> >
> >
> >
> > On Wed, Oct 25, 2017 at 8:44 AM, Michael Di Domenico
> > <mdidomenico4 at gmail.com> wrote:
> >
> > the patch does work.  the only complaint i might register is that the
> > first time through the loop you still get an ifconf ioctl error in the
> > log file.  it does proceed along to bind to the AF_INET afterwards,
> > but it might cause a red herring support issue later on
> >
> >
> > On Tue, Oct 24, 2017 at 5:30 PM, Hal Rosenstock
> > <hal.rosenstock at gmail.com> wrote:
> >> I have supplied a patch for Michael to try and will submit it to
> >> rdma-core once tested...
> >>
> >> On Tue, Oct 24, 2017 at 4:38 PM, Weiny, Ira <ira.weiny at intel.com> wrote:
> >>>
> >>> We probably need to fall back to an AF_INET check.  I did not realize
> >>> that disabling ipv6 would cause this to fail because AF_INET6 usually
> >>> “covers” AF_INET.  My guess here is that if you excluded IPv6 support
> >>> from the kernel that would explain the failure.
> >>>
> >>>
> >>>
> >>> Perhaps you could try a patch which falls back to AF_INET?
> >>>
> >>>
> >>>
> >>> Ira
> >>>
> >>>
> >>>
> >>> From: Users [mailto:users-bounces at lists.openfabrics.org] On Behalf Of
> >>> Hal Rosenstock
> >>> Sent: Tuesday, October 24, 2017 8:59 AM
> >>> To: Michael Di Domenico <mdidomenico4 at gmail.com>
> >>> Cc: users at lists.openfabrics.org
> >>> Subject: Re: [Users] ibacm?
> >>>
> >>>
> >>>
> >>> This makes more sense ;-)
> >>>
> >>>
> >>>
> >>> What were the steps used to disable ipv6 ?
> >>>
> >>>
> >>>
> >>> On Tue, Oct 24, 2017 at 11:16 AM, Michael Di Domenico
> >>> <mdidomenico4 at gmail.com> wrote:
> >>>
> >>> it looks like i found the culprit
> >>>
> >>> when running ibacm out of the box on rhel 7.4 the ibacm.log shows
> >>>
> >>> acm_if_iter_sys: ioctl ifconf error -1
> >>>
> >>> if i change the line
> >>>
> >>> s = socket(AF_INET6, SOCK_DGRAM, 0);
> >>>
> >>> to
> >>>
> >>> s = socket(AF_INET, SOCK_DGRAM, 0);
> >>>
> >>> as Hal suggested and start the ibacm daemon, acm correctly binds to
> >>> the ipoib addresses and interfaces.  my initial report of this change
> >>> not being effective was a miscommunication in that i thought it
> >>> related to the client and not the service process
> >>>
> >>> i can then run
> >>>
> >>> ib_acme -f i -s 172.22.64.96 -d 172.22.64.96 -S 172.22.64.96 -v -V
> >>>
> >>> and get back valid data
> >>>
> >>> ib_acme -d <hostname> still doesn't work, but that might be internal:
> >>> we don't currently have reverse/forward entries for our ipoib
> >>> interfaces.  i'm still looking into it
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Oct 16, 2017 at 1:54 PM, Michael Di Domenico
> >>> <mdidomenico4 at gmail.com> wrote:
> >>> > it's come to my attention that ibacm might not be working correctly
> >>> > on my cluster, but i'm unable to determine why ibacm is failing
> >>> >
> >>> > here's what i did
> >>> >
> >>> > ib_acme -A -O
> >>> > systemctl restart ibacm
> >>> >
> >>> > in the /var/log/ibacm file i see
> >>> >
> >>> > acm_if_iter_sys: ioctl ifconf error -1
> >>> > acmp_join_group: qib0 1 pkey 0xffff, sl 0x0, rate 0x3, mtu 0x4
> >>> > acm_server: started
> >>> >
> >>> > but when i try to query the node locally
> >>> >
> >>> > ib_acme -d node001 -v -V
> >>> >
> >>> > in the log file i see
> >>> >
> >>> > acm_svr_resolve_dest: notice - unknown local end point address
> >>> >
> >>> > on the console i see
> >>> >
> >>> > Service: localhost
> >>> > Destination: 172.21.80.1
> >>> > ib_acm_resolve_ip failed: cannot assign requested address
> >>> > SA verification: failed cannot assign requested address
> >>> >
> >>> > the ibacm_addr.cfg contains
> >>> > node001 qib0 1 default
> >>> > node001-1 qib0 1 default
> >>> >
> >>> > all the nodes in the cluster are configured the exact same way and
> >>> > produce the same result when trying to query locally or a remote node
> >>> >
> >>> > any thoughts?