[ofw] SRP, windows and OpenSolaris

Grüninger, Andreas (LGL Extern) Andreas.Grueninger at lgl.bwl.de
Fri Oct 8 03:58:18 PDT 2010


Hi all

At the bottom I have appended the initial email of a conversation with Tzachi about my problem with the SRP target on OpenSolaris B134 and the Mellanox drivers.
In the meantime I downloaded the WinOF source, applied the attached diff files, compiled it and checked the debug output of the checked build.
With the changes from the diff files and the modified inf file, the srp driver works on Windows 2003 x64.
Currently we are running the Mellanox drivers 2.1.1 with an ib_srp.sys driver modified by Tzachi.
Over the weekend I will compile and install the RC3 or RC4 driver from WinOF.

We have the same problem with both the Mellanox and the WinOF driver versions.
When the srp driver is active and the server is rebooted, it hangs forever and must be power-cycled via the power button of the remote management board.
When the srp driver is deactivated and IPoIB is active, the reboot succeeds.
I expect no different results from the release candidates because this part of the source code does not seem to have changed. But I will check it.

Questions:
- Any chance to get some hints to debug the hanging system?
- Is the svn repository accessible over HTTP (our firewall does not allow the svn protocol)?

Kind Regards

Andreas


> Hello
> 
> I installed
> - SRP target OpenSolaris B134 with SUNWhermon driver
> - MLNX_VPI_wnet_x64.msi on Windows Server 2003 x64 SP2
> - MLNX_OFED_LINUX v1.4 with OpenSM 3.3.0 on SLES10 SP2
> - VMWare release-1.4.1-222
> - InfiniBand cards: ConnectX Dual-Port DDR MHQH29-XTC
> 
> The SRP targets can be used with initiators from Linux and ESX.
> 
> On Windows the SRP target is not recognized.
> 
> After the installation of the Mellanox driver package the target is 
> shown in the device manager as:
> Other devices
>    ? Solaris SRP Target 0.9a
> 
> In the details of the device I found for "Compatible Ids" the values
> IBA\C0100c690ep0108r0001
> IBA\C0100c690ep0108
> 
> In ib_srp.inf these values are defined:
> [SRP.DeviceSection.ntx86]
> %SRP.DeviceDesc% = SRP.DDInstall,IBA\C0100c609ep0108r0001, \
>                    IBA\Cff00c609ep0108r0001, \
>                    IBA\C0100c609ep0108, \
>                    IBA\Cff00c609ep0108
> 
> This looks like two transposed digits (c690e reported by the device
> versus c609e in the inf file).
> Most probably the error is on the OpenSolaris side, which sends the
> wrong ids.
> 
> When I change the inf like this
>
> [SRP.DeviceSection.ntamd64]
> %SRP.DeviceDesc% = SRP.DDInstall,IBA\C0100c609ep0108r0001, \
>                    IBA\Cff00c609ep0108r0001, \
>                    IBA\C0100c609ep0108, \
>                    IBA\Cff00c609ep0108, \
>                    IBA\C0100c690ep0108r0001, \
>                    IBA\C0100c690ep0108
>
> the miniport driver can be installed and is put under
>
> SCSI and RAID controllers
> SCSI and RAID controllers
>    Mellanox Infiniband SRP Miniport
> 
> But the driver cannot be started with the message "This device cannot 
> start. (Code 10)"
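
The id mismatch described in the quoted mail above can be spotted mechanically. The following is only a minimal sketch in Python, not part of WinOF or the Mellanox package: the function names are hypothetical, and the id lists are simply copied from the mail. It treats the ids as opaque strings and makes no assumption about what the individual fields mean.

```python
import difflib

# Compatible Ids shown by Device Manager for the Solaris target (from the mail).
REPORTED_IDS = [
    r"IBA\C0100c690ep0108r0001",
    r"IBA\C0100c690ep0108",
]

# Ids listed in the stock ib_srp.inf (from the mail).
INF_IDS = [
    r"IBA\C0100c609ep0108r0001",
    r"IBA\Cff00c609ep0108r0001",
    r"IBA\C0100c609ep0108",
    r"IBA\Cff00c609ep0108",
]


def unmatched_ids(reported, known):
    """Return the reported ids that do not appear in the inf (case-insensitive)."""
    known_lower = {k.lower() for k in known}
    return [r for r in reported if r.lower() not in known_lower]


def nearest_known_id(reported_id, known):
    """Return the inf id most similar to reported_id, or None if known is empty."""
    matches = difflib.get_close_matches(reported_id, known, n=1, cutoff=0.0)
    return matches[0] if matches else None


def differing_positions(a, b):
    """Return the character positions at which two ids differ (up to the shorter length)."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    for rid in unmatched_ids(REPORTED_IDS, INF_IDS):
        near = nearest_known_id(rid, INF_IDS)
        print(rid, "is not in the inf; closest entry:", near,
              "differs at positions", differing_positions(rid, near))
```

Run against the values from the mail, both reported ids come back as unmatched, and the first one differs from its closest inf entry in only two adjacent characters, which is consistent with a digit transposition.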


-----Original Message-----
From: ofw-bounces at lists.openfabrics.org [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
Sent: Thursday, October 7, 2010 21:14
To: Chris Worley
Cc: ofw at lists.openfabrics.org; scst-devel
Subject: Re: [ofw] [ANNOUNCE] winOFED 2.3 RC3 available for download

Chris Worley wrote:
> On Tue, Oct 5, 2010 at 12:37 PM, Smith, Stan <stan.smith at intel.com>
> wrote:
>> Hi chris,
>>
>> Chris Worley wrote:
>>> Are there any testing procedures for the Windows SRP initiator?  If 
>>> so, where can they be found?
>>
>> No specific SRP tests are in the SVN source tree.
>> Mellanox used to be the SRP maintainers although they have backed 
>> away from this due to other pressing concerns.
>>
>> SRP testing has been limited to installing Windows SRP drivers and 
>> communicating with an OFED 1.4.1 system exporting vdisks.
>> Once the windows client sees the vdisks, multi-gigabyte files are 
>> copied to and back from the SRP target and then verified (fc.exe) to 
>> be the same bytes; basic functionality, performance not addressed.
>> Since there are no active SRP maintainers, SRP status is questionable 
>> at this juncture.
>> Care to join the party as an SRP maintainer?
>
> In general, what versions of Windows are used in testing non-SRP parts 
> of the stack?  W2K8R2 "Standard" seemed to work well... but 
> "Enterprise" seems to go nuts w/ interrupts and NUMA distribution, and 
> occasionally locks up.
>
> Thanks,
>
> Chris

Hi Chris,
  Prior to a winOFED GA (general availability) release, the code has been installed/uninstalled and tested on the following platform combinations using Mellanox HCAs; mostly InfiniHost, some ConnectX on svr2008 & svr2008 R2:

1) x64 - svr2003, win7(Pro), svr2008(Ent,Std) , svr2008 R2(Ent), svr2008 R2 HPC Edition, XP-64
2) x86 - svr2003, win7(Ult), svr2008(Ent), svr2008 R2, XP
3) ia64 - svr2003

How do you observe the aforementioned Enterprise problems?
NUMA is not tested (no hardware). Are you sure you are not speaking of win2k8-R2 DataCenter?
What IB hardware are you using?
Latest firmware?

stan.

PS: BTW, congrats on getting SRP somewhat working and identifying the latest SRP target release that works.
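
The manual SRP check described in the quoted text above (copy a large file to the SRP-backed disk, copy it back, verify with fc.exe) could be scripted along these lines. This is only a rough sketch in Python under stated assumptions: the function names and the `roundtrip.bin` file name are hypothetical, and a SHA-256 comparison stands in for fc.exe. Note that the copy read back may be served from the operating system's file cache rather than from the target, so this checks basic functionality only, as the original procedure does.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(source, target_dir, scratch_dir):
    """Copy source to target_dir, copy it back to scratch_dir, compare hashes.

    target_dir is assumed to be a directory on the SRP-backed disk;
    scratch_dir is assumed to be a directory on a local disk.
    """
    on_target = shutil.copy(source, target_dir)
    copied_back = shutil.copy(on_target, os.path.join(scratch_dir, "roundtrip.bin"))
    return sha256_of(source) == sha256_of(copied_back)
```

A call such as `round_trip_ok(r"C:\temp\big.bin", "S:\\", r"C:\scratch")` would return True when the bytes survive the round trip; the paths here are placeholders, not values from this thread.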

-------------- next part --------------
A non-text attachment was scrubbed...
Name: srp_hba_c.diff
Type: application/octet-stream
Size: 898 bytes
Desc: srp_hba_c.diff
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20101008/1fe88392/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: srp_h.diff
Type: application/octet-stream
Size: 701 bytes
Desc: srp_h.diff
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20101008/1fe88392/attachment-0001.obj>

