[openib-general] uDAPL open HCA problem
Grant Grundler
iod00d at hp.com
Fri Oct 21 09:23:43 PDT 2005
On Fri, Oct 21, 2005 at 12:17:28PM -0400, Sayantan Sur wrote:
> Hello,
>
> I have udapl over Gen2 set up on our cluster and am able to run udapl
> programs. However, sometimes I get this error (after a few runs of the
> same program):
>
> open_hca: ERR ib_at_ips_by_gid for mthca0
> dapls_ib_open_hca failed 40000
>
> The machine is an AMD Opteron (Tyan S2895), with Mellanox MemFree cards
> (fw ver 5.1.0).
Folks here will still need to know:
1) Which kernel version?
2) Which SVN version of GEN2 are you using?
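A minimal sketch of one way to collect both answers, assuming the gen2 tree is an SVN working copy and that `svnversion` is installed. `GEN2_DIR` is a hypothetical variable standing in for wherever the tree was checked out; it is not something from the original report.

```shell
#!/bin/sh
# Print the running kernel release (question 1).
kernel_version() {
    uname -r
}

# Print the SVN revision of a working copy (question 2).
# Falls back to "unknown" if svnversion is missing or the
# directory does not exist.
svn_revision() {
    dir="$1"
    if command -v svnversion >/dev/null 2>&1 && [ -d "$dir" ]; then
        svnversion "$dir"
    else
        echo "unknown"
    fi
}

# GEN2_DIR is an assumed name; it defaults to the current directory.
echo "kernel: $(kernel_version)"
echo "gen2 svn revision: $(svn_revision "${GEN2_DIR:-.}")"
```

Run it from the top of the checkout, or set GEN2_DIR to the checkout path first, and paste the two output lines into the reply.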
hth,
grant
>
> lsmod on my machine shows this:
>
> [surs@ro0:~] lsmod | grep ^ib
> ib_ipoib 48008 0
> ib_uat 14840 0
> ib_at 25696 1 ib_uat
> ib_sa 17804 2 ib_ipoib,ib_at
> ib_ucm 22280 0
> ib_cm 37744 1 ib_ucm
> ib_uverbs 35992 0
> ib_umad 18208 0
> ib_mthca 122656 0
> ib_mad 44072 4 ib_sa,ib_cm,ib_umad,ib_mthca
> ib_core 56192 8 ib_ipoib,ib_sa,ib_ucm,ib_cm,ib_uverbs,ib_umad,ib_mthca,ib_mad
>
> My infiniband devices are (created by hand):
>
> [surs@ro0:~] ls -l /dev/infiniband/
> total 0
> crw-rw-rw- 1 root root 231, 191 2005-10-20 21:13 uat
> crw-rw-rw- 1 root root 231, 224 2005-10-20 21:12 ucm0
> crwxrwxrwx 1 root root 231, 192 2005-09-21 04:37 umad0
> crwxrwxrwx 1 root root 231, 192 2005-09-16 19:29 uverbs0
> crwxrwxrwx 1 root root 231, 192 2005-09-16 19:29 uverbs1
>
>
> I'd really appreciate it if someone could help me understand what might be
> going wrong.
>
> Thanks,
> Sayantan.
>
> --
> http://www.cse.ohio-state.edu/~surs