Re: [ofa-general] [NFS/RDMA] Can't mount NFS/RDMA partition

Hal Rosenstock hal.rosenstock at gmail.com
Wed Dec 17 06:43:09 PST 2008


Hi,

On Wed, Dec 17, 2008 at 7:56 AM, Celine Bourde
<celine.bourde at ext.bull.net> wrote:
> Hi,
>
> I can't mount an NFS/RDMA partition.
> I've followed the instructions in
> http://www.openfabrics.org//downloads/OFED/ofed-1.4/OFED-1.4-docs/nfs-rdma.release-notes.txt
>
> Every step (loading modules, setting up /etc/exports, starting the NFS
> daemon, etc.) seems to be OK, but when I run the last command:
> mount -o rdma,port=2050 192.168.0.13:/export /tmp/nfs_client/
> the mount process hangs, even though the last dmesg output looks correct:
> "RPC: Registered rdma transport module.
> rpcrdma: connection to 192.168.0.13:2050 on mlx4_0, memreg 5 slots 32 ird 16
> "
> If I run "ibstat" after that, I get a kernel panic message:
> "ibpanic: [4826] main: stat of IB device 'mlx4_0' failed: (Device or
> resource busy)" because the device is in use.

That's an application-level "panic" (ibstat aborting), not a kernel
panic; it just means the tool hit some abnormal condition.

I'm not familiar with what NFS/RDMA does with the MAD layer, but there
may be some conflict with the diagnostic tools in this area. Another
possibility is that the firmware error you mention below is what causes
this condition.

> 100% of the CPU is used by the ib_mad1 process:

> [root at test]top
> top - 14:55:07 up 19 min,  3 users,  load average: 2.00, 1.87, 1.12
> Tasks: 190 total,   2 running, 188 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.0%us, 12.5%sy,  0.0%ni, 87.5%id,  0.0%wa,  0.0%hi,  0.0%si,
>  0.0%st
> Mem:   8066156k total,   615096k used,  7451060k free,    45604k buffers
> Swap:  8193140k total,        0k used,  8193140k free,   343436k cached
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 2952 root      15  -5     0    0    0 R  100  0.0   5:23.55 ib_mad1
>   1 root      20   0 10320  688  572 S    0  0.0   0:02.04 init
>   2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
>   3 root      RT  -5     0    0    0 S    0  0.0   0:00.00 migration/0
>   4 root      15  -5     0    0    0 S    0  0.0   0:00.01 ksoftirqd/0
>
>
> I can't kill the mount process (kill -9, shutdown -R, and echo b >
> sysrq-trigger all fail), so I have to restart the computer using
> "ipmitool target chassis power reset".
>
> Any ideas?

Is there anything in dmesg or /var/log/messages relating to ib_mad?
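
Something along these lines would pull out the relevant entries. This is
only a sketch: the helper name ib_log_scan and the ib_mad|mlx4 pattern
are my own assumptions for illustration, not part of OFED, and the log
path may differ on your distribution.

```shell
# Hypothetical helper (not part of OFED): print log lines that mention
# ib_mad or mlx4, case-insensitively. The pattern is an assumption.
ib_log_scan() {
    if [ "$#" -gt 0 ]; then
        # A file was given, e.g. /var/log/messages
        grep -i -E 'ib_mad|mlx4' "$1"
    else
        # No file given: filter standard input, e.g. piped from dmesg
        grep -i -E 'ib_mad|mlx4'
    fi
}
```

Use it as "ib_log_scan /var/log/messages" for the system log, or as
"dmesg | ib_log_scan" for the kernel ring buffer.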

-- Hal

> Moreover, I sometimes see this in dmesg: mlx4_core 0000:01:00.0: HW2SW_MPT
> failed (-16). (I don't think it is related to the mount bug.) I read that
> this error could occur with old firmware versions, but mine is 2.5.9.
> For more details see the bug report:
> https://bugs.openfabrics.org/show_bug.cgi?id=1459
>
> Thanks for your help.
>
> Céline Bourde.