[ofa-general] SRP target sporadic behaviour with Solaris, VMware
Vu Pham
vuhuong at mellanox.com
Wed Mar 12 15:18:00 PDT 2008
Bart Van Assche wrote:
> On Wed, Mar 12, 2008 at 12:03 AM, Daniel Pocock <daniel at pocock.com.au> wrote:
>> I've recently set up the SRP target module on Linux (2.6.22).
>>
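A minimal sketch of bringing up a Linux SRP target of that era, assuming
the OFED/SCST ib_srpt module; the module name and the port check are
assumptions, not details taken from this thread:

    # Load the SRP target kernel module (name as shipped with OFED/SCST)
    modprobe ib_srpt
    # Confirm the HCA port is Active before exporting any devices
    ibstat | grep -i state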
>> Trying to access the target from various initiators (Fedora, Debian,
>> Solaris 10, VMware ESX 3.5) gives mixed results.
>>
>> The Linux clients, despite having limited configuration tools, worked
>> immediately.
>>
>> I've opened a thread on the Sun forums to discuss the Solaris 10 issue:
>>
>> http://forum.java.sun.com/thread.jspa?threadID=5273631
>>
>> On VMware:
>> - I had to reboot my new VMware ESX server a few times before it found
>> my 500GB target.
>> - VMware completely rejects a target if it doesn't have a partition
>> table - I ran parted on Linux and then VMware was OK (see the first
>> sketch after this list)
>> - Also, the messages in VMware gave me the impression it would clobber
>> the whole volume, rather than just a single partition - so to avoid the
>> possibility of losing my other partitions, I made a special target
>> representing the intended partition rather than the entire volume. Now
>> I have a VMware partition table nested within a partition.
>> - VMware only seems to show one target at a time - I had created a few
>> test targets, but I could only see one of them. Is this what other
>> people see? ibsrpdm on the other Linux hosts shows all the targets
>> (the second sketch below shows ibsrpdm in use).
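A minimal sketch of the parted step mentioned above, assuming the SRP disk
shows up as /dev/sdb on the Linux host; the device name and label type are
assumptions:

    # Write an msdos partition table so ESX will accept the LUN
    parted /dev/sdb mklabel msdos
    # Create one primary partition spanning the whole disk
    parted /dev/sdb mkpart primary 0% 100%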
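And a sketch of listing targets with ibsrpdm from a Linux initiator and
connecting to one through the stock ib_srp sysfs interface; the HCA/port
in the path and the placeholder values are assumptions:

    # Print one connection string per discovered target
    ibsrpdm -c
    # Paste a string into add_target to log in to that target
    echo "id_ext=...,ioc_guid=...,dgid=...,pkey=ffff,service_id=..." \
        > /sys/class/infiniband_srp/srp-mthca0-1/add_target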
>
> My experience with SRP is as follows (with Linux 2.6.24 + SCST + SRPT
> as target):
> * Linux SRP initiator: works perfectly.
> * OpenSolaris SRP initiator: I could not get Sun's SRP initiator
> working on OpenSolaris. I even asked a Solaris expert to help me, but
> he couldn't get the SRP initiator working either.
> * VMware ESX 3.5 + Mellanox InfiniBand drivers (released in January
> 2008): so far I have only tested a setup with a single target. When
> doing a lot of I/O over the SRP connection, after about 10 minutes the
> virtual machine running on VMware starts logging communication errors.
> I reported this yesterday to Mellanox support, and Mellanox is
> currently working on this issue. Note: I had to upgrade the InfiniBand
> switch firmware before the ESX server was able to find the SRP target.
>
Which virtual disk mode did you use (rdm, rdmp, vmfs, raw)?
Could you provide the Mellanox FAE with both the VM's /var/log/messages
and the ESX host's /var/log/vmkernel?
I'll look over them.
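A sketch of collecting the two logs requested above; the file paths are
the stock locations and the archive names are placeholders:

    # Inside the virtual machine
    tar czf vm-messages.tgz /var/log/messages
    # On the ESX service console
    tar czf esx-vmkernel.tgz /var/log/vmkernel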
-vu