[ofw] Issue w/ Multipathing SRP in WinOF 2.3 RC4 w/ >4 drives

Chris Worley worleys at gmail.com
Thu Oct 14 09:24:21 PDT 2010


I've root-caused this issue to Microsoft's MPIO implementation.  While
it advertises a 16-character Product ID, it appears to match only the
first 8 characters, and my eight drives have only four unique Product
IDs when truncated at the eighth character.
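
As a rough illustration of the truncation (a sketch, not MPIO's actual
matching code), the following Python groups the eight device IDs from
the quoted "mpclaim -h" output by the first 8 characters of the Product
ID.  The vendor/product split comes from the "Vendor 8Product 16"
header; the names group_by_product_prefix and MATCH_LEN are made up for
this example, and the 8-character comparison is the hypothesis above,
not documented behavior.

```python
from collections import defaultdict

# Device IDs copied from the quoted "mpclaim -h" output (padding stripped).
DEVICE_IDS = [
    "SCST_BIOfio-40948",
    "SCST_BIOfio-71962",
    "SCST_BIOfio-71957",
    "SCST_BIOfio-41000",
    "SCST_BIOfio-41001",
    "SCST_BIOfio-71964",
    "SCST_BIOfio-41002",
    "SCST_BIOfio-71965",
]

VENDOR_LEN = 8  # from the "Vendor 8" column header
MATCH_LEN = 8   # ASSUMPTION: only this many Product ID characters are compared


def group_by_product_prefix(device_ids, match_len=MATCH_LEN):
    """Group Product IDs by their first match_len characters."""
    groups = defaultdict(list)
    for dev in device_ids:
        product = dev[VENDOR_LEN:]  # drop the 8-character vendor field
        groups[product[:match_len]].append(product)
    return dict(groups)


groups = group_by_product_prefix(DEVICE_IDS)
for prefix, products in groups.items():
    print(prefix, len(products), products)
```

Under that assumption the eight drives collapse into four groups of
sizes 1, 3, 1 and 3; with two paths per drive, that lines up with the
2, 6, 2 and 6 paths in the snapshot quoted below.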

Chris
On Wed, Oct 13, 2010 at 10:08 AM, Chris Worley <worleys at gmail.com> wrote:
> In speaking with someone who knows MPIO internals, and I paraphrase,
> MPIO keys off of Vid/pid combinations, for which it's obviously only
> seeing four unique combinations; possibly the other four have
> dynamically changed or have disappeared.
>
> I do worry about the "disappear" notion: whenever I add elements to
> the MPIO known set, I see an SRP logout on the target, then a log back
> in.  It makes me wonder if there's a timing issue.  This goes along
> with MPIO discovery initially showing only four drives, even though
> Device Manager sees all eight twice (and knows which are duplicates).
> When I register the 4 drives that MPIO discovers and return to the
> discovery window, it shows two more drives; after adding those two, it
> then shows two more.  But after adding all eight, only four drives
> are exposed by MPIO, so it almost seems like a timing issue where
> SRP may be exposing one at a time, and MPIO gives up after four are
> exposed.
>
> Also, they wonder why the product ID is unique for each device.  When
> I register these w/ SCST, I use syntax like:
>
> echo "open identifier /dev/block-device ... " > /proc/scsi_tgt/vdisk/vdisk
>
> ... and each of those identifiers must be unique for each block
> device.  Is my understanding correct, or can I have one name
> associated w/ multiple block devices?  If I try to do that, I get a
> logged error:
>
> dev_vdisk: ***ERROR***: Virtual device with name foo already exist
>
> Yet on Windows, this identifier is being treated as a product ID,
> which "someone who knows" indicates would be inappropriate.
>
> So there are two questions here:
>
> 1) Whether each exported drive presents a unique Vid/pid combination
> that does not change or disappear, and
> 2) Whether Vdisk identifiers on the target are being inappropriately
> used as Product Names on the Windows initiator.
>
> Thanks,
>
> Chris
> P.S. sorry for the top post, I figure nobody would wade through all
> the associated data below to get to the new information.
> On Tue, Oct 12, 2010 at 10:42 AM, Chris Worley <worleys at gmail.com> wrote:
>> On Mon, Oct 11, 2010 at 1:26 PM, Chris Worley <worleys at gmail.com> wrote:
>>> On Mon, Oct 11, 2010 at 12:37 PM, Chris Worley <worleys at gmail.com> wrote:
>>>> While multipathing helps SRP shortcomings in many respects, including
>>>> performance, it seems Windows (W2K8R2 Standard) multipathing gets very
>>>> confused with >4 drives.
>>>>
>>>> With 8 drives exported to Windows over SRP, without Multipath
>>>> installed/configured, Windows correctly shows 8 drives and 8
>>>> duplicates, and labels the duplicates as offline.
>>>>
>>>> In configuring multipathing, the "MPIO Properties" dialog in the
>>>> "Discover Multi-paths" tab initially shows four drives.  After adding
>>>> those and (w/o rebooting) returning to the tab, it shows two more;
>>>> after adding those, it shows the last two.  With all 8 added and
>>>> after a reboot, "Device Manager" shows 4 MPIO SRP drives (the four
>>>> it showed initially in the "Discover Multi-paths" tab).
>>>>
>>>> Looking at the MPIO properties of two of those drives shows two paths
>>>> each, the other two drives show six paths each (there are only two
>>>> paths).  When I try to change the "Policy" from the "Round Robin"
>>>> default to "Least Queue Depth", the two drives with two paths have no
>>>> issue; the two drives with six paths put up an error message that
>>>> merely describes the policy, but not any error, and the change won't
>>>> take.
>>>>
>>>> I uninstalled MPIO and reinstalled it, and the same occurred.
>>>> Strangely, after the reinstall, the "Least Queue Depth" policy stuck
>>>> on drives I had assigned that to before... as if the uninstall didn't
>>>> clean the registry as well as it should have.
>>>>
>>>> Any ideas?
>>>
>>> Text output from Windows (you can see the four drives it allows me to
>>> use, but the eight drives it sees):
>>>
>>> C:\Users\Administrator>mpclaim -e
>>>
>>> "Target H/W Identifier   "   Bus Type     MPIO-ed      ALUA Support
>>> -------------------------------------------------------------------------
>>> "SCST_BIOfio-71962       "   Fibre        YES          ALUA Not Supported
>>> "SCST_BIOfio-71957       "   Fibre        YES          ALUA Not Supported
>>> "SCST_BIOfio-41000       "   Fibre        YES          ALUA Not Supported
>>> "SCST_BIOfio-40948       "   Fibre        YES          ALUA Not Supported
>>>
>>> C:\Users\Administrator>mpclaim -h
>>>
>>> "MSDSM Supported DeviceId"
>>> -------------------------------------------------------------------------
>>> "Vendor 8Product       16"
>>> "SCST_BIOfio-40948       "
>>> "SCST_BIOfio-71962       "
>>> "SCST_BIOfio-71957       "
>>> "SCST_BIOfio-41000       "
>>> "SCST_BIOfio-41001       "
>>> "SCST_BIOfio-71964       "
>>> "SCST_BIOfio-41002       "
>>> "SCST_BIOfio-71965       "
>>>
>>
>> More info to add:
>>
>> If I totally delete MPIO, the registry still shows entries for the
>> four drives under all the ControlSets, including Current... and these
>> can't be deleted.
>>
>> The registry also shows the SCSI port settings: two different ports w/
>> eight LUNS each, and for each LUN in one port, the name and serial
>> numbers differ, and match exactly the name and serial numbers at the
>> other port.  I can't imagine what else MPIO would be keying off of.
>>
>> I also see SRP logouts and logins generated when I change the MPIO
>> settings, and I wonder whether MPIO is getting confused in the
>> in-between state.
>>
>> Here's more mpclaim output showing the four drives it recognizes,
>> including the two drives w/ four extra paths each.  Those extra
>> paths point at other LUNs, and there is no reason to believe they
>> are the same disk:
>>
>> MPIO Storage Snapshot on Monday, 11 October 2010, at 12:34:03.674
>>
>> Registered DSMs: 1
>> ================
>> +--------------------------------|-------------------|----|----|----|---|-----+
>> |DSM Name                        |      Version      |PRP | RC | RI |PVP| PVE |
>> |--------------------------------|-------------------|----|----|----|---|-----|
>> |Microsoft DSM                   |006.0001.07600.16385|0020|0003|0001|030|False|
>> +--------------------------------|-------------------|----|----|----|---|-----+
>>
>>
>> Microsoft DSM
>> =============
>> MPIO Disk3: 02 Paths, Least Queue Depth, ALUA Not Supported
>>       SN: 66696F2D34303934
>>       Supported Load Balance Policies: FOO RR RRWS LQD WP LB
>>
>>   Path ID          State              SCSI Address      Weight
>>   ---------------------------------------------------------------------------
>>   0000000077030000 Active/Optimized   003|000|000|006   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|006   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>> MPIO Disk2: 06 Paths, Least Queue Depth, ALUA Not Supported
>>       SN: 66696F2D34313030
>>       Supported Load Balance Policies: FOO RR RRWS LQD WP LB
>>
>>   Path ID          State              SCSI Address      Weight
>>   ---------------------------------------------------------------------------
>>   0000000077030000 Active/Optimized   003|000|000|007   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077030000 Active/Optimized   003|000|000|005   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077030000 Active/Optimized   003|000|000|004   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|007   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|005   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|004   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>> MPIO Disk1: 06 Paths, Least Queue Depth, ALUA Not Supported
>>       SN: 66696F2D37313936
>>       Supported Load Balance Policies: FOO RR RRWS LQD WP LB
>>
>>   Path ID          State              SCSI Address      Weight
>>   ---------------------------------------------------------------------------
>>   0000000077030000 Active/Optimized   003|000|000|003   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077030000 Active/Optimized   003|000|000|002   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077030000 Active/Optimized   003|000|000|001   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|003   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|002   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|001   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>> MPIO Disk0: 02 Paths, Least Queue Depth, ALUA Not Supported
>>       SN: 66696F2D37313935
>>       Supported Load Balance Policies: FOO RR RRWS LQD WP LB
>>
>>   Path ID          State              SCSI Address      Weight
>>   ---------------------------------------------------------------------------
>>   0000000077030000 Active/Optimized   003|000|000|000   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>>   0000000077010000 Active/Optimized   001|000|000|000   0
>>       Adapter: InfiniBand SRP Miniport...                (B|D|F: 000|000|000)
>>       Controller: 46616B65436F6E74726F6C6C6572 (State: Active)
>>
>> MSDSM-wide default load balance policy: N\A
>>
>> No target-level default load balance policies have been set.
>>
>> ================================================================================
>>
>> Is there anybody out there who made it this far through the verbiage?
>>
>> Thanks,
>>
>> Chris
>>
>
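
The serial numbers (SN) in the quoted mpclaim snapshot are hex-encoded
ASCII, and decoding them gives a quick cross-check.  A minimal Python
sketch; the helper name decode_hex_ascii and the dictionary layout are
illustrative only, while the hex strings are copied from the snapshot:

```python
# Hex strings copied from the "SN:" and "Controller:" fields of the snapshot.
SERIALS = {
    "MPIO Disk3": "66696F2D34303934",
    "MPIO Disk2": "66696F2D34313030",
    "MPIO Disk1": "66696F2D37313936",
    "MPIO Disk0": "66696F2D37313935",
}
CONTROLLER = "46616B65436F6E74726F6C6C6572"


def decode_hex_ascii(hex_string):
    """Interpret a string of hex digit pairs as ASCII text."""
    return bytes.fromhex(hex_string).decode("ascii")


decoded = {disk: decode_hex_ascii(sn) for disk, sn in SERIALS.items()}
for disk, text in decoded.items():
    print(disk, text)
print("Controller", decode_hex_ascii(CONTROLLER))
```

Each decoded serial is exactly eight characters long and coincides with
the first eight characters of one or more of the exported Product IDs,
which is consistent with the truncation described at the top of this
message.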