[Fwd: Re: [ofa-general] [NFS/RDMA] Can't mount NFS/RDMA partition]
Diego Moreno
Diego.Moreno-Lazaro at bull.net
Tue Apr 28 05:45:01 PDT 2009
Hi,
I'm working with Celine trying to make NFS/RDMA work. We installed new
firmware (2.6.636). We still have the problem, but now we have more
information on the client side.
- With the workaround (memreg 6) we can mount without any problem. We
can read a file, but if we try to create a file with dd, the application
hangs and we then have to do 'umount -f'. There is no message on the server.
Message on client:
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32
ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
- With fast registration (memreg 5):
There is no message on the server. Client dmesg output:
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32
ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32
ird 16
------------[ cut here ]------------
WARNING: at kernel/softirq.c:136 local_bh_enable_ip+0x3c/0x92()
Modules linked in: xprtrdma autofs4 hidp nfs lockd nfs_acl rfcomm l2cap
bluetooth sunrpc iptable_filter ip_tables ip6t_REJECT xt_tcpudp
ip6table_filter ip6_tables x_tables cpufreq_ondemand acpi_cpufreq
freq_table rdma_ucm ib_sdp rdma_cm iw_cm ib_addr ib_ipoib ib_cm ib_sa
ipv6 ib_uverbs ib_umad iw_nes ib_ipath ib_mthca dm_multipath scsi_dh
raid0 sbs sbshc battery acpi_memhotplug ac parport_pc lp parport mlx4_ib
ib_mad ib_core e1000e sr_mod joydev cdrom mlx4_core i5000_edac edac_core
shpchp rtc_cmos sg pcspkr rtc_core rtc_lib i2c_i801 i2c_core serio_raw
button dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage ata_piix
libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last
unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.27_ofa_compil #2
Call Trace:
<IRQ> [<ffffffff80235b8d>] warn_on_slowpath+0x51/0x77
[<ffffffff80229b79>] __wake_up+0x38/0x4f
[<ffffffff80246d57>] __wake_up_bit+0x28/0x2d
[<ffffffffa05485af>] rpc_wake_up_task_queue_locked+0x223/0x24b [sunrpc]
[<ffffffffa054861e>] rpc_wake_up_status+0x47/0x82 [sunrpc]
[<ffffffff80239c49>] local_bh_enable_ip+0x3c/0x92
[<ffffffffa0638fd1>] rpcrdma_conn_func+0x6d/0x7c [xprtrdma]
[<ffffffffa063b316>] rpcrdma_qp_async_error_upcall+0x45/0x5a [xprtrdma]
[<ffffffffa0294bb3>] mlx4_ib_qp_event+0xf9/0x100 [mlx4_ib]
[<ffffffff802443da>] __queue_work+0x22/0x32
[<ffffffffa01fc5d4>] mlx4_qp_event+0x8a/0xad [mlx4_core]
[<ffffffffa01f50a5>] mlx4_eq_int+0x55/0x291 [mlx4_core]
[<ffffffffa01f52f0>] mlx4_msi_x_interrupt+0xf/0x16 [mlx4_core]
[<ffffffff802624f4>] handle_IRQ_event+0x25/0x53
[<ffffffff80263c0a>] handle_edge_irq+0xe3/0x123
[<ffffffff8020e907>] do_IRQ+0xf1/0x15e
[<ffffffff8020c381>] ret_from_intr+0x0/0xa
<EOI> [<ffffffffa0549c3e>] nul_marshal+0x0/0x20 [sunrpc]
[<ffffffff80212474>] mwait_idle+0x41/0x45
[<ffffffff8020abdf>] cpu_idle+0x7e/0x9c
---[ end trace 5cc994fbe7e141af ]---
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32
ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
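For reference, the failing sequence described above amounts to roughly the
following (a sketch; the server address, export path, and port are taken from
the thread, and the file names are hypothetical):

```shell
# Reproduction sketch for the memreg-6 case described above.
# Assumes server 192.168.0.215 exports /vol0 over NFS/RDMA on port 2050.
# Run as root on the client.
sysctl -w sunrpc.rdma_memreg_strategy=6          # all-physical workaround mode
mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/

cat /mnt/somefile > /dev/null                    # reads work fine
dd if=/dev/zero of=/mnt/testfile bs=1M count=10  # write: the hang occurs here

umount -f /mnt                                   # forced unmount needed after the hang
```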
Thanks,
Diego
Vu Pham wrote:
> Celine Bourde wrote:
>> We still have the same problem, even after changing the registration
>> method.
>>
>> mount does not return, and this is the dmesg output on the client:
>>
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32
>> ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32
>> ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>> ib0: multicast join failed for
>> ff12:401b:ffff:0000:0000:0000:0000:0001, status -22
>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 6 slots 32
>> ird 16
>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>>
>> I still have another doubt: if the firmware is the problem, why does
>> NFS/RDMA work with kernel 2.6.27.10, without OFED 1.4, on these same
>> cards?
>
> On 2.6.27.10, nfsrdma does not use fast registration work requests;
> therefore, it works well with ConnectX.
>
> From 2.6.28 onward, nfsrdma started implementing/using fast
> registration work requests, and those changes were committed without
> being verified against ConnectX.
>
> I'm looking into those glitches/issues and trying to resolve them now.
>
> -vu
>
>>
>> Thanks,
>>
>> Céline Bourde.
>>
>> Tom Talpey wrote:
>>> At 06:56 AM 4/27/2009, Celine Bourde wrote:
>>>
>>>> Thanks for the explanation.
>>>> Let me know if you have additional information.
>>>>
>>>> We have a contact at Mellanox. I will contact him.
>>>>
>>>> Thanks,
>>>>
>>>> Céline.
>>>>
>>>> Vu Pham wrote:
>>>>
>>>>> Celine,
>>>>>
>>>>> I see mlx4 in the log, so it is ConnectX.
>>>>>
>>>>> nfsrdma does not work with the official ConnectX firmware release
>>>>> 2.6.0 because of fast registration work request problems between
>>>>> nfsrdma and the firmware.
>>>>>
>>>
>>> There is a very simple workaround if you don't have the latest mlx4
>>> firmware.
>>>
>>> Just set the client to use the all-physical memory registration mode.
>>> This avoids issuing the fast registration requests that the firmware
>>> advertises but does not properly support.
>>>
>>> Before mounting, enter (as root)
>>>
>>> sysctl -w sunrpc.rdma_memreg_strategy=6
>>>
>>> The client should work properly after this.
>>>
>>> If you do have access to the fixed firmware, I recommend using the
>>> default setting (5), as it provides greater safety on the client.
>>>
>>> Tom.
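[Tom's workaround above can be applied as a short sequence; a sketch,
assuming root on the client and the sunrpc module loaded so the sysctl key
exists:]

```shell
# Check the current memory registration strategy, then switch to
# all-physical mode (6) before mounting, per the workaround above.
sysctl sunrpc.rdma_memreg_strategy        # show current value (default: 5)
sysctl -w sunrpc.rdma_memreg_strategy=6   # use all-physical registration

mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/
```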
>>>
>>>
>>>>> We are currently debugging/fixing those problems.
>>>>>
>>>>> Do you have direct contact with a Mellanox field application
>>>>> engineer? If so, please contact him/her.
>>>>> If not, I can send you a contact through a private channel.
>>>>>
>>>>> thanks,
>>>>> -vu
>>>>>
>>>>>
>>>>>> Hi Celine,
>>>>>>
>>>>>> What HCA do you have on your system? Is it ConnectX? If yes, what
>>>>>> is its firmware version?
>>>>>>
>>>>>> -vu
>>>>>>
>>>>>>
>>>>>>> Hey Celine,
>>>>>>>
>>>>>>> Thanks for gathering all this info! So the RDMA connections work
>>>>>>> fine with everything _but_ nfsrdma. And errno 103 (ECONNABORTED)
>>>>>>> indicates the connection was aborted, maybe by the server (since
>>>>>>> no failures are logged by the client).
>>>>>>>
>>>>>>>
>>>>>>> More below:
>>>>>>>
>>>>>>>
>>>>>>> Celine Bourde wrote:
>>>>>>>
>>>>>>>> Hi Steve,
>>>>>>>>
>>>>>>>> This email summarizes the situation:
>>>>>>>>
>>>>>>>> Standard mount -> OK
>>>>>>>> ---------------------
>>>>>>>>
>>>>>>>> [root at twind ~]# mount -o rw 192.168.0.215:/vol0 /mnt/
>>>>>>>> The command works fine.
>>>>>>>>
>>>>>>>> rdma mount -> KO
>>>>>>>> -----------------
>>>>>>>>
>>>>>>>> [root at twind ~]# mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/
>>>>>>>> The command blocks! I have to press Ctrl+C to kill the process.
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> [root at twind ofa_kernel-1.4.1]# strace mount.nfs
>>>>>>>> 192.168.0.215:/vol0 /mnt/ -o rdma,port=2050
>>>>>>>> [..]
>>>>>>>> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>>>>>>>> connect(3, {sa_family=AF_INET, sin_port=htons(610),
>>>>>>>> sin_addr=inet_addr("127.0.0.1")}, 16) = 0
>>>>>>>> fcntl(3, F_SETFL, O_RDWR) = 0
>>>>>>>> sendto(3,
>>>>>>>>
>>>> "-3\245\357\0\0\0\0\0\0\0\2\0\1\206\270\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0"...,
>>>>
>>>>>>>> 40, 0, {sa_family=AF_INET, sin_port=htons(610),
>>>>>>>> sin_addr=inet_addr("127.0.0.1")}, 16) = 40
>>>>>>>> poll([{fd=3, events=POLLIN}], 1, 3000) = 1 ([{fd=3,
>>>>>>>> revents=POLLIN}])
>>>>>>>> recvfrom(3,
>>>>>>>> "-3\245\357\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 8800,
>>>>>>>> MSG_DONTWAIT, {sa_family=AF_INET, sin_port=htons(610),
>>>>>>>> sin_addr=inet_addr("127.0.0.1")}, [16]) = 24
>>>>>>>> close(3) = 0
>>>>>>>> mount("192.168.0.215:/vol0", "/mnt", "nfs", 0,
>>>>>>>> "rdma,port=2050,addr=192.168.0.215"
>>>>>>>> ..same problem
>>>>>>>>
>>>>>>>> [root at twind tmp]# dmesg
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5
>>>>>>>> slots 32 ird 16
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5
>>>>>>>> slots 32 ird 16
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5
>>>>>>>> slots 32 ird 16
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5
>>>>>>>> slots 32 ird 16
>>>>>>>> rpcrdma: connection to 192.168.0.215:2050 closed (-103)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> Is there anything logged on the server side?
>>>>>>>
>>>>>>> Also, can you try this again, but on both systems do this before
>>>>>>> attempting the mount:
>>>>>>>
>>>>>>> echo 32768 > /proc/sys/sunrpc/rpc_debug
>>>>>>>
>>>>>>> This will enable all the rpc trace points and add a bunch of
>>>>>>> logging to /var/log/messages.
>>>>>>> Maybe that will show us something. I think the server is
>>>>>>> aborting the connection for some reason.
>>>>>>>
>>>>>>> Steve.
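[Steve's debugging suggestion above, run on both client and server around the
mount attempt, looks like this sketch; the value 32768 is the one he gives,
and 0 disables the tracing again:]

```shell
# Enable verbose SUNRPC debugging before reproducing the failure.
# Trace messages go to the kernel log (/var/log/messages or dmesg).
echo 32768 > /proc/sys/sunrpc/rpc_debug   # enable rpc trace points

# ... attempt the rdma mount and reproduce the failure ...

echo 0 > /proc/sys/sunrpc/rpc_debug       # disable debugging afterwards
```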
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> general mailing list
>>>>>>> general at lists.openfabrics.org
>>>>>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>>>>>>
>>>>>>> To unsubscribe, please visit
>>>>>>> http://openib.org/mailman/listinfo/openib-general
>>>>>>>
>>>
>>>
>>>
>>>
>>
>