[openib-general] uDAPL problem

Todd Bowman twbowman at gmail.com
Tue Sep 27 07:48:49 PDT 2005


On 27 Sep 2005 09:55:02 -0400, Hal Rosenstock <halr at voltaire.com> wrote:
>
> On Tue, 2005-09-27 at 09:51, James Lentini wrote:
> > On Mon, 26 Sep 2005, Hal Rosenstock wrote:
> >
> > > On Mon, 2005-09-26 at 18:05, Todd Bowman wrote:
> > > > I am having a problem with uDAPL accessing
> > > > /dev/infiniband/{uat,ucm0}. I am running 3549, 2.6.12 kernel with
> > > > backport. Here is a snippet of the uDAPL debug messages running
> > > > dtest. The dat.conf file seems to be correct, and the correctly
> > > > named providers are being loaded.
> > > >
> > > > 26248 Running as server
> > > > DAT Registry: dat_ia_openv (OpenIB-ib0,1:2,0) called
> > > > DAT Registry: IA OpenIB-ib0, trying to load library
> > > > /usr/local/lib/libdapl.so
> > > > libuat: Error <-1:6> couldn't open IB at device </dev/infiniband/uat>
> > > > libibcm: error <-1:6> opening device </dev/infiniband/ucm0>
> >
> > This means that the /dev entries are not set up correctly.
>
> Correct. He set this up manually. Todd wrote:
> "I am not running udev but manually create uat and ucm."


The correct major/minor #s fixed that problem.
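
For anyone else creating the nodes by hand, a quick check along these lines may help. This is only a rough sketch: the expected numbers are the ones Hal gives in this thread (uat at 231/191, ucm devices starting at 231/224), and the helper names `read_devnum` and `find_mismatches` are made up for illustration, not part of any OpenIB tool.

```python
import os
import stat

# Expected (major, minor) pairs, taken from the replies in this thread.
EXPECTED = {
    "uat": (231, 191),
    "ucm0": (231, 224),
}

def read_devnum(path):
    """Return (major, minor) of a character device, or None if the
    path is missing or is not a character device."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    if not stat.S_ISCHR(st.st_mode):
        return None
    return (os.major(st.st_rdev), os.minor(st.st_rdev))

def find_mismatches(expected, actual):
    """Return {name: (want, got)} for every node whose numbers differ."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }

if __name__ == "__main__":
    base = "/dev/infiniband"
    actual = {n: read_devnum(os.path.join(base, n)) for n in EXPECTED}
    for name, (want, got) in find_mismatches(EXPECTED, actual).items():
        print(f"{name}: expected {want}, found {got}")
```

With the listing quoted further down (uat at 231/254, ucm0 at 231/255), both nodes would be reported as mismatches.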

> > > > DAPL: NOT Setting Loopback
> > > > dapl_ib_init:
> > > > DAT Registry: dat_registry_add_provider (OpenIB-ib0,1:2,0)
> > > > dapl_ia_open (OpenIB-ib0, 8, 0x10019d40, 0x10019cc0)
> > > > open_hca: mthca0 - 0x1001fdb0
> > > > open_hca: Found dev mthca0 f422000002c90200
> > > > open_hca: GID subnet 00000000000080fe id f522000002c90200
> > >
> > > These look like they need to be endianized to me.
> >
> > This looks like a bug in the way we print these values out, but I
> > don't think it is the real problem.
>
> Right, it's just a cosmetic issue with the display.
>
> -- Hal
>
> > What architecture are you using?


Apple G5.
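
To illustrate the display issue, here is a small sketch that reverses the byte order of the printed 64-bit values. It assumes the debug output is simply byte-swapped, which is what the numbers suggest but which I have not confirmed in the uDAPL source; `bswap64` is just an illustrative helper.

```python
def bswap64(hex_str):
    """Reverse the byte order of a 64-bit value given as 16 hex digits."""
    raw = bytes.fromhex(hex_str)
    if len(raw) != 8:
        raise ValueError("expected exactly 16 hex digits")
    return raw[::-1].hex()

# Values as printed by open_hca in the debug output.
print(bswap64("00000000000080fe"))  # fe80000000000000
print(bswap64("f522000002c90200"))  # 0002c902000022f5
```

Swapped, the subnet value becomes fe80000000000000, the default link-local subnet prefix, and the id starts with 0002c9, which looks like the HCA vendor's OUI, so the underlying GID appears to be fine.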

>
> > > > ips_by_gid: ERR ips_by_gid -1 Bad file descriptor
> > > > open_hca: ERR ib_at_ips_by_gid for mthca0
> > > > dapls_ib_open_hca failed 40000
> > > > dapl_ia_open () returns 0x40000
> > > > 26248: Error Adaptor open: DAT_INTERNAL_ERROR
> > > > DAT Registry: Stopped (dat_fini)
> > > > DAPL: Stopped (dapl_fini)
> > > > dapl_ib_release:
> > > >

> > >
> > > > I am not running udev but manually create uat and ucm. Here is the
> > > > list of /dev/infiniband:
> > > >
> > > > ls -l /dev/infiniband/
> > > > total 0
> > > > crw-rw-rw- 1 root root 231, 64 Sep 22 15:18 issm0
> > > > crw-rw-rw- 1 root root 231, 65 Sep 22 15:18 issm1
> > > > crw-rw-rw- 1 root root 231, 254 Sep 22 22:47 uat
> > >
> > > uat is at 231/191.
> > >
> > > > crw-rw-rw- 1 root root 231, 255 Sep 20 22:31 ucm
> > >
> > > I don't think you need this.
> > >
> > > > crw-rw-rw- 1 root root 231, 255 Sep 26 20:01 ucm0
> > >
> > > ucm devices start at 231/224.
> >
> > If these changes do not fix your problem, please let us know.
> >
> > > -- Hal
> > >
> > > > crw-rw-rw- 1 root root 231, 0 Sep 22 15:18 umad0
> > > > crw-rw-rw- 1 root root 231, 1 Sep 22 15:18 umad1
> > > > crw-rw-rw- 1 root root 231, 192 Sep 20 22:30 uverbs0
> > > > crw-rw-rw- 1 root root 231, 193 Sep 20 22:30 uverbs1
> > > >
> > > >
> > > > And the loaded modules:
> > > >
> > > > kdapl_ib 82000 0
> > > > kdapl 14888 1 kdapl_ib
> > > > ib_uverbs 52064 0
> > > > ib_ipoib 65480 0
> > > > ib_ucm 32624 0
> > > > ib_cm 51944 2 kdapl_ib,ib_ucm
> > > > ib_uat 22168 0
> > > > ib_at 34840 2 kdapl_ib,ib_uat
> > > > ib_sa 25328 2 ib_ipoib,ib_at
> > > > ib_mthca 160376 0
> > > > ib_mad 61108 3 ib_cm,ib_sa,ib_mthca
> > > > ib_core 73888 8 kdapl_ib,ib_uverbs,ib_ipoib,ib_ucm,ib_cm,ib_sa,ib_mthca,ib_mad
> > > >
> > > >
> > > > I am sure that I am missing something simple. Can someone point me
> > > > in the right direction?
> > > >
> > > > Thanks,
> > > > Todd
>
I am now having a different problem in ips_by_gid:

open_hca: Found dev mthca0 f422000002c90200
open_hca: GID subnet 00000000000080fe id f522000002c90200
ips_by_gid: ERR ips_by_gid -1 No such device
open_hca: ERR ib_at_ips_by_gid for mthca0
dapls_ib_open_hca failed 40000
dapl_ia_open () returns 0x40000
DT_cs_Server: Could not open OpenIB-ib0 (DAT_INTERNAL_ERROR )

Thanks,
Todd