[Users] APM, LMC, and iSER; oh my!

Jack Wang xjtuwjp at gmail.com
Wed Apr 6 01:59:23 PDT 2016


Regarding APM, Oracle has a patch for that, FYI:

https://oss.oracle.com/git/?p=linux-uek3-3.8.git;a=commitdiff;h=c8ead45c3e5abed09586a4d51826e429c225cba1

2016-04-05 19:54 GMT+02:00 Robert LeBlanc <robert at leblancnet.us>:
> I was hoping someone here would have had experience with some of the
> technologies listed here. Is there a different mailing list where this
> question would be more appropriate?
>
> Thanks,
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
> On Wed, Mar 30, 2016 at 1:58 PM, Robert LeBlanc <robert at leblancnet.us> wrote:
>>
>> I've been trying to understand Automatic Path Migration (APM) in order
>> to provide extreme stability in our InfiniBand fabric[0][1]. My
>> research has also led me to LID Mask Control (LMC) as a possibility for
>> providing hot-spot avoidance[2]. In fact it seems like the hot-spot
>> avoidance could also provide detection of failed paths and remove them
>> from the stripe set, providing benefits of both approaches.
>>
>> From the articles, it seems that additional code had to be written to
>> leverage either of the technologies, and that it was targeted at MPI
>> applications. In the APM case there were three modules (the Alternate
>> Path Specification module, the Path Loading Request module, and the
>> Path Migration module), which handle different aspects of configuring
>> APM and controlling the failover/failback of the paths. In the case of
>> [2], a shim module was created that performed the striping and
>> consolidating of the data across the links.
>>
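To make the "stripe set" idea above concrete, here is a minimal sketch in
Python. It is only an illustration of round-robin striping with failed-path
removal, not the shim module from [2]; the class name StripeSet, its methods,
and the path labels are all hypothetical.

```python
class StripeSet:
    """Round-robin chunks over live paths (hypothetical illustration)."""

    def __init__(self, paths):
        self.paths = list(paths)   # e.g. one entry per LID pair or link
        self._next = 0             # running counter used for round-robin

    def mark_failed(self, path):
        """Drop a path from the stripe set once it is detected as failed."""
        if path in self.paths:
            self.paths.remove(path)

    def assign(self, chunks):
        """Return (path, chunk) pairs, spreading chunks over live paths."""
        if not self.paths:
            raise RuntimeError("no live paths left in the stripe set")
        plan = []
        for chunk in chunks:
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            plan.append((path, chunk))
        return plan


stripes = StripeSet(["path-a", "path-b", "path-c"])
before = stripes.assign(range(4))
stripes.mark_failed("path-b")      # pretend path-b stopped responding
after = stripes.assign(range(4, 8))
```

After mark_failed, new chunks only land on the surviving paths, which is the
combined hot-spot avoidance plus failure handling described above.
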
>> We'd like to leverage one or both of these features in our
>> environment. Enabling LMC in the subnet manager seems pretty
>> straightforward; the application then has to leverage the multiple
>> LIDs. I can't find any good documentation on configuring the
>> environment for APM. Based on what I've read, you have to pass the
>> alternate path to the verbs call that modifies the QP, but I'm not sure
>> how to get the alternate path to begin with. Could this be another LID
>> pair from LMC, or does it require a separate PKEY?
>>
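For reference, with an LMC value of n each port gets 2^n consecutive LIDs
starting at a base LID whose low n bits are zero. The Python sketch below
just enumerates those LIDs and the possible source/destination LID pairs.
The helper names port_lids and candidate_paths are hypothetical, and whether
one of these pairs is acceptable as an APM alternate path in a given fabric
is exactly the open question above; the sketch does not answer it. As far as
I know, in libibverbs the alternate path itself is loaded with ibv_modify_qp
using the IBV_QP_ALT_PATH attribute mask and the alt_ah_attr field.

```python
from itertools import product


def port_lids(base_lid, lmc):
    """List the 2**lmc LIDs assigned to a port with the given base LID."""
    if not 0 <= lmc <= 7:          # LMC is a 3-bit field
        raise ValueError("lmc must be between 0 and 7")
    count = 1 << lmc
    if base_lid % count:
        raise ValueError("base LID must be aligned to 2**lmc")
    return list(range(base_lid, base_lid + count))


def candidate_paths(src_base, dst_base, lmc):
    """Enumerate every (source LID, destination LID) pair (hypothetical helper)."""
    return list(product(port_lids(src_base, lmc), port_lids(dst_base, lmc)))


lids = port_lids(16, 2)
pairs = candidate_paths(16, 32, 1)
primary, alternates = pairs[0], pairs[1:]
```

Here the first pair is arbitrarily treated as the primary path and the rest
as candidates for an alternate; a real setup would pick them based on how
the subnet manager routes each LID.
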
>> In the "APM support for IPoIB" [3] thread, it sounds like APM and
>> potentially [2] can't be performed across discrete adapters, which is
>> something that would be really helpful for us. It sounds like there
>> may have been some progress in this area over the last 3.5 years, but
>> I can't seem to find anything.
>>
>> Our primary use is iSER at the moment, and even with three links, we
>> have situations where the majority of the paths go through a single
>> switch. We would like to have paths forced through different switches
>> and balance the loads as much as possible. I'd be grateful for any
>> links to documentation or any discussions that will help me get past
>> this roadblock in my understanding. We are currently exporting the
>> iSER target multiple times and using multipath on the client, but it
>> seems InfiniBand could do this a lot better.
>>
>> [0] https://www.researchgate.net/publication/220952412_Automatic_Path_Migration_over_InfiniBand_Early_Experiences
>> [1] http://hpc.pnl.gov/people/vishnu/public/vishnu_cluster09.pdf
>> [2] http://hpc.pnl.gov/people/vishnu/public/vishnu_ccgrid07.pdf
>> [3] http://comments.gmane.org/gmane.linux.drivers.rdma/13529
>>
>> Thanks,
>> - ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1