Re: [ofa-general] [NFS/RDMA] Can't mount NFS/RDMA partition
Hal Rosenstock
hal.rosenstock at gmail.com
Wed Dec 17 06:43:09 PST 2008
Hi,
On Wed, Dec 17, 2008 at 7:56 AM, Celine Bourde
<celine.bourde at ext.bull.net> wrote:
> Hi,
>
> I can't mount an NFS/RDMA partition.
> I followed the instructions in
> http://www.openfabrics.org//downloads/OFED/ofed-1.4/OFED-1.4-docs/nfs-rdma.release-notes.txt
>
> Every step (loading modules, setting up /etc/exports, starting the NFS
> daemon, etc.) seems to be OK, but when I run the last command:
> mount -o rdma,port=2050 192.168.0.13:/export /tmp/nfs_client/
> the mount process hangs, even though the last dmesg output seems correct:
> "RPC: Registered rdma transport module.
> rpcrdma: connection to 192.168.0.13:2050 on mlx4_0, memreg 5 slots 32 ird 16
> "
> If I run "ibstat" after that, I get a kernel panic message:
> "ibpanic: [4826] main: stat of IB device 'mlx4_0' failed: (Device or
> resource busy)" because device is in use.
That's an application-level "panic" raised by the diagnostic tool itself,
not a kernel panic; it means the tool hit some abnormal condition.
I'm not familiar with what NFS/RDMA does with the MAD layer, but there
may be a conflict with the diagnostic tools in this area. Another
possibility is that the firmware error you mention below (HW2SW_MPT
failed) causes this condition.
> 100% of a CPU is used by ib_mad1:
> [root at test]top
> top - 14:55:07 up 19 min, 3 users, load average: 2.00, 1.87, 1.12
> Tasks: 190 total, 2 running, 188 sleeping, 0 stopped, 0 zombie
> Cpu(s): 0.0%us, 12.5%sy, 0.0%ni, 87.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 8066156k total, 615096k used, 7451060k free, 45604k buffers
> Swap: 8193140k total, 0k used, 8193140k free, 343436k cached
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>  2952 root  15  -5     0    0    0 R  100  0.0  5:23.55 ib_mad1
>     1 root  20   0 10320  688  572 S    0  0.0  0:02.04 init
>     2 root  15  -5     0    0    0 S    0  0.0  0:00.00 kthreadd
>     3 root  RT  -5     0    0    0 S    0  0.0  0:00.00 migration/0
>     4 root  15  -5     0    0    0 S    0  0.0  0:00.01 ksoftirqd/0
>
>
> I can't kill the mount process (neither kill -9, shutdown -R, nor echo b >
> sysrq-trigger works), and I have to reset the machine with
> "ipmitool target chassis power reset".
>
> Any ideas?
Is there anything in dmesg or /var/log/messages related to ib_mad?
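
If it helps to narrow that down, here is a minimal Python sketch for
pulling the relevant lines out of a saved log. The keyword list
(ib_mad, mlx4_core, rpcrdma) and the function name matching_lines are
only assumptions for illustration, taken from names that appear in this
thread; adjust them to whatever the drivers actually log.

```python
import re
from typing import Iterable, List

# Assumed keywords, taken from names seen in this thread; extend as needed.
PATTERN = re.compile(r"ib_mad|mlx4_core|rpcrdma", re.IGNORECASE)


def matching_lines(lines: Iterable[str], pattern: "re.Pattern[str]" = PATTERN) -> List[str]:
    """Return the log lines that mention any of the keywords, newline stripped."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_file(path: str) -> List[str]:
    """Scan a log file (for example /var/log/messages) for matching lines."""
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, errors="replace") as handle:
        return matching_lines(handle)
```

Something like scan_file("/var/log/messages") would then return only the
lines worth posting to the list; the same function works on dmesg output
saved to a file.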
-- Hal
> Moreover, I sometimes see this in dmesg: mlx4_core 0000:01:00.0: HW2SW_MPT
> failed (-16). (I don't think it is related to the mount bug.) I read that
> this error can occur with old firmware versions, but mine is 2.5.9.
> For more details see bug report :
> https://bugs.openfabrics.org/show_bug.cgi?id=1459
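
A side note on the (-16) above: assuming it follows the usual kernel
convention of returning a negative errno value, it can be decoded with
a few lines of Python. On Linux, errno 16 is EBUSY, which is the same
"Device or resource busy" condition that ibstat reported. The helper
name describe_status is hypothetical, and this is only a sketch for
decoding the number, not a statement about what the driver does.

```python
import errno
import os
from typing import Tuple


def describe_status(code: int) -> Tuple[str, str]:
    """Map a kernel-style status such as -16 to (errno name, message).

    Assumes the value is a negated errno number, which is the common
    kernel convention; unknown numbers are reported as "UNKNOWN".
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


name, message = describe_status(-16)
print(name, "-", message)
```

The message text comes from the local C library, so its exact wording
can differ between systems.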
>
> Thanks for your help.
>
> Céline Bourde.