[ofw] WDK build environment migration thoughts

Alex Naslednikov alexn at mellanox.co.il
Thu May 1 06:36:06 PDT 2008


Thank you all for reviewing my commit!
I received a lot of e-mails, so I would like to put things in order and answer them all in one mail.
Many of these questions were already answered in my previous e-mails. For your convenience, I will answer them all again here.

Q1. I still don't understand why we need this commit and which problem it solves.
Before this commit, we had several main problems:
   1. 32/64 compatibility (x86 application on x64 kernel).
   2. Sign extension of __ptr64 fields. This problem is also relevant to 64/64 systems, not only to 32/64.
   3. Earlier versions of the WDK failed to compile code with __ptr64 pointers to functions in the x86 CHK build (AL_API * __ptr64 ib_pfn_xxx). A fix was promised for WDK6001.18001, but two months ago we could not rely on it.
   4. WDK porting.
For more information, please read http://lists.openfabrics.org/pipermail/ofw/2008-April/002307.html
That post explains this issue and several related ones in detail.

Q2. It sounds like the __ptr64 patch failed to meet its objective if 32/64 support is broken. Just deleting the __ptr64 attribute would have accomplished the same end result and been 'cleaner'.

Definitely not! The __ptr64 and WDK migration patch exists to solve exactly this problem. We had very limited time to test it fully on mixed (32/64) systems, but many tests that failed before now pass. Of course, it may still contain bugs (Fab found one major bug, and I have already fixed it).
The main point is that we removed all __ptr64 occurrences but still preserved a uniform layout for all IOCTL structures by using the TO_LONG_PTR macro, which solves the problem of pointer sign extension.
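
To illustrate the layout point, here is a minimal sketch in portable C11 (not WDK code). The CONCAT and TO_LONG_PTR definitions are the ones posted earlier in this thread; demo_ioctl_t and demo_set_context are hypothetical names invented for this example, and clearing the padding before storing the pointer is an assumption about how a 32-bit caller could keep the upper 4 bytes clean, not something taken from the patch.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint32_t, uint64_t */

#define CONCAT(str1, str2) str1##str2

/* As posted on the list: the pointer shares storage with a 64-bit pad,
 * so the field occupies 8 bytes in both 32-bit and 64-bit builds. */
#define TO_LONG_PTR(type, member_name) \
    union { type member_name; uint64_t CONCAT(member_name, _padding); }

/* Hypothetical IOCTL payload, for illustration only (not a real IBAL struct). */
typedef struct demo_ioctl {
    TO_LONG_PTR(void *, p_context);  /* was: void * __ptr64 p_context; */
    uint32_t length;
    uint32_t flags;
} demo_ioctl_t;

/* The layout is the same whatever the native pointer size is. */
static_assert(sizeof(demo_ioctl_t) == 16, "IOCTL struct must be 16 bytes");
static_assert(offsetof(demo_ioctl_t, length) == 8, "pointer slot must be 8 bytes");

/* Assumption: zero the whole 8-byte slot first, so that a 32-bit build
 * (which writes only 4 bytes of pointer) leaves no stale upper bits. */
static void demo_set_context(demo_ioctl_t *req, void *ctx)
{
    req->p_context_padding = 0;
    req->p_context = ctx;
}
```

Because the pointer is stored through its natural type, no widening conversion happens in user code, so there is nothing to sign-extend.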

Q3. Why can't we use the PtrToPtr64() macro, which solves the problem of pointer sign extension?

(See also the related links and the explanations in the first section of http://lists.openfabrics.org/pipermail/ofw/2008-April/002307.html)

Yes, it does indeed solve it :-). And our initial solution used this technique.
But that solution has its own limitations. First of all, consider what you would have to do.
At first glance, it looks pretty simple: remove all __ptr64 attributes except those in IOCTL structures, then find all the places where an unsafe (signed) pointer conversion can take place, and put PtrToPtr64() there.
But if you have a struct inside a struct inside an IOCTL struct (a real case), you have to go through them all manually and verify that every use of a __ptr64 pointer is "safe", or put PtrToPtr64() there.
In the end, I spent a lot of time and effort finding these places MANUALLY, and the resulting patch contained 10K lines. I am not sure anybody could code-review it, and there was still a good chance of bugs, because the changes were made by hand.
Our current solution is much better because it is based on automatic changes, is much more readable, and will be much easier to debug afterwards.

Q4. I can't understand why we need 4 different names for the *PTR64 macros.
Q5. Let's remove all the empty (void) *PTR64 macros!
They are a key idea of our solution :-)
The TO_LONG_PTR macro replaces __ptr64 inside IOCTL structures.
The remaining macros are named after the place where __ptr64 was removed:
FUNC_PTR64 - inside a function signature
VOID_PTR64 - inside plain code
STRUCT_PTR64 - inside a struct that is known (for sure) not to be used inside IOCTLs
Etc.

These temporary macros will be removed in the next release (or even later). For now, I'd like to keep them for debugging: one can easily redefine them back to __ptr64, or just compare against the old code. Please do not remove them now!
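
As a rough sketch of how such placeholders behave, the following C fragment defines the three macros as empty, which is how VOID_PTR64 was described earlier in this thread; treating FUNC_PTR64 and STRUCT_PTR64 the same way is an assumption. DEMO_RESTORE_PTR64 is a hypothetical debug switch, and demo_pfn_t, demo_item_t and demo_sum are invented for illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stddef.h>   /* NULL */

/* Hypothetical debug switch: map the placeholders back to __ptr64
 * (Microsoft compilers only) to compare against the old behavior. */
#if defined(DEMO_RESTORE_PTR64) && defined(_MSC_VER)
#define FUNC_PTR64   __ptr64
#define VOID_PTR64   __ptr64
#define STRUCT_PTR64 __ptr64
#else
/* Normal case: the macros expand to nothing and only mark the spot. */
#define FUNC_PTR64
#define VOID_PTR64
#define STRUCT_PTR64
#endif

/* FUNC_PTR64: inside a function-pointer signature. */
typedef int (* FUNC_PTR64 demo_pfn_t)(int value);

/* STRUCT_PTR64: inside a struct that never crosses the IOCTL boundary. */
typedef struct demo_item {
    struct demo_item * STRUCT_PTR64 p_next;
    int value;
} demo_item_t;

/* VOID_PTR64: inside ordinary code (parameters, locals). */
static int demo_sum(const demo_item_t * VOID_PTR64 p_head, demo_pfn_t pfn)
{
    int total = 0;
    const demo_item_t * VOID_PTR64 p;

    for (p = p_head; p != NULL; p = p->p_next)
        total += pfn(p->value);
    return total;
}

static int demo_double(int value)
{
    return 2 * value;
}
```

With the macros empty, every declaration above is plain C with native-width pointers; the distinct names only record which kind of site each removal came from.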

Q6. The __ptr64 sign extension happened with the DDK too - so why is it suddenly an issue?  Is it because 32-bit apps running on a 64-bit OS get a lot of pointers with the MSb set?
Yes. And not only that! When trying our first solution (with the PtrToPtr64 macro), I found several cases where a long (__ptr64) pointer received its value from 32-bit pointers and handles. That is no longer the case.
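
The effect can be modeled in portable C with integers standing in for pointers. This is only a sketch: demo_widen_signed and demo_widen_unsigned are hypothetical helpers, and the unsigned path is meant to mirror the zero-extension that this thread attributes to PtrToPtr64(), not to reproduce its implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Models a 32-bit address widened through a signed type: the sign bit is
 * copied into the upper 32 bits. (Converting a value above INT32_MAX to
 * int32_t is implementation-defined before C23; mainstream compilers wrap
 * in two's complement, which is what this model relies on.) */
static uint64_t demo_widen_signed(uint32_t addr32)
{
    return (uint64_t)(int64_t)(int32_t)addr32;
}

/* Models widening through an unsigned type: the upper 32 bits are zero. */
static uint64_t demo_widen_unsigned(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

For an address with the most significant bit set, such as 0x80001000, the signed path yields 0xFFFFFFFF80001000 while the unsigned path yields 0x0000000080001000. Addresses below 0x80000000 come out the same either way, so the bug only shows up for 32-bit applications that receive high addresses.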

Q7. Rebranding. Why were some *.inf files changed?
Sorry, guys, it was a merge problem. I will send a patch in a separate mail.

Q8. I wrote in my previous mail:

Please be aware that WinOF modules that aren't in the WinIB stack (additional ULPs such as udapl, vnic, etc.) have to be changed according to the new methodology.
Also, I'd like to draw your attention to the fact that these modules [UDAPL2, VNIC] will work as is on homogeneous systems (x86, x64), but not on mixed systems (x86 application on x64 kernel).
In addition, Microsoft fixed an internal compiler bug that occurred when compiling modules with long (__ptr64) pointers to functions (only in the x86 CHECKED environment).
So you should not have problems with compilation after adjusting the makefiles.

Some people understood this to mean that the current patch doesn't support 32/64 systems.
I only meant that UDAPL2, VNIC and other modules that are not in the WinIB stack were not patched. They should still work in 32/32 and 64/64 configurations after the appropriate changes to their makefiles.

More to come.

XaleX




> _____________________________________________ 
> From: 	Alex Naslednikov  
> Sent:	Wednesday, April 30, 2008 10:20 AM
> To:	Alex Naslednikov; 'Smith, Stan'; Ishai Rabinovitz
> Cc:	'ofw at lists.openfabrics.org'
> Subject:	RE: [ofw] WDK build environment migration thoughts
> 
> Hello,
> I committed our WDK and __ptr64 patch into WinOF trunk, and WinOF and WinIB trunks were synchronized again.
> You can find below some further explanations :
> 
> 1. IBAL now compiles with WDK6001.18001. According to Microsoft, this should be the final, official release.
> We preserved backward compatibility with the DDK, but some intermediate versions of the WDK may be incompatible.
> 
> 2. Please be aware that WinOF modules that aren't in the WinIB stack (additional ULPs such as udapl, vnic, etc.) have to be changed according to the new methodology.
> Also, I'd like to draw your attention to the fact that these modules will work as is on homogeneous systems (x86, x64), but not on mixed systems (x86 application on x64 kernel).
> In addition, Microsoft fixed an internal compiler bug that occurred when compiling modules with long (__ptr64) pointers to functions (only in the x86 CHECKED environment).
> So you should not have problems with compilation after adjusting the makefiles.
> 
> 3. This revision contains:
>  3.1. All bugfixes from WinOF trunk, from rev. 939 to 1067
>  3.2. Mellanox __ptr64 solution and WDK porting, starting from rev. 2164
>  3.3. All bugfixes and patches from connectx branches (both Mellanox and WinOF)
> It was a large amount of code to merge from 4 different svn trees (trunk and connectx branch in WinOF, and trunk and connectx branch in WinIB).
> We would appreciate your code review, just to be sure that we didn't forget to include any minor patch or bug fix.
> 
> 4. I carefully tested the new trunk inside Mellanox, on different platforms, with both the DDK and WDK compilers. Please let us know about any problem, however minor, that you find during your testing.
> 
> Thanks,
> 
> Naslednikov Alexander (a.k.a XaleX)
> Windows Team
> Mellanox Technologies 
> 
> _____________________________________________ 
> From: 	Alex Naslednikov  
> Sent:	Monday, April 21, 2008 7:15 PM
> To:	Alex Naslednikov; 'Smith, Stan'; Ishai Rabinovitz
> Cc:	'ofw at lists.openfabrics.org'
> Subject:	RE: [ofw] WDK build environment migration thoughts
> 
> Hi all,
> I would like to repost my previous message, because I haven't received your comments yet.
> Our regression seems to be stable, so we are going to commit the change into the WinOF trunk very soon.
> For your convenience, I also provide some typical changes as a patch (attached to this mail). Please read the explanation below first - it will help you a lot.
> Be aware that all modules not contained in the Mellanox WinIB stack (like udapl, vnic) should also be changed according to this methodology.
> 
> It is a very large change, so I'll appreciate your time and effort in reviewing the methodology and the patch itself.
> 
> Thanks,
> 
> Naslednikov Alexander (a.k.a XaleX)
> Windows Team
> Mellanox Technologies
> 
> 
> 
> _____________________________________________ 
> From: 	Alex Naslednikov  
> Sent:	Thursday, April 10, 2008 4:09 PM
> To:	'Smith, Stan'; Ishai Rabinovitz
> Cc:	ofw at lists.openfabrics.org
> Subject:	RE: [ofw] WDK build environment migration thoughts
> 
> Hi all,
> It's a good idea to clarify some points before announcing the Mellanox patch for WDK porting and the __ptr64 problems.
> I hope these explanations are informative enough and not too long.
> 
> 1. __ptr64 problem
> Briefly, this problem arises when a 32-bit pointer is copied into a 64-bit pointer. In this case, the pointer is sign-extended.
> How does this apply to WinOF? Many pointers were declared __ptr64 (i.e., always "long", even on 32-bit kernels) in order to keep the size of the structs used in IOCTL calls uniform. The main problem this causes is between 32-bit user applications and a 64-bit kernel.
> When user code does something like
> s_ptr = &my_struct;
> my_type* __ptr64 ptr = s_ptr;
> then the kernel receives ptr with invalid data in the upper bits (4 bytes of FF) whenever the most significant bit of the 32-bit address is set.
> To avoid pointer sign extension, the PtrToPtr64() function should be used.
> Also, I found some other places where dangerous pointer sign extension took place, even on a 32-bit kernel.
> Yet another problem with the __ptr64 attribute is an internal compiler error (C1001) in the WDK when using a __ptr64 pointer to a function (callback).
> This problem was described in the ofw discussion; see also:
> http://blogs.msdn.com/texblog/archive/2005/10/31/487436.aspx
> http://lists.openfabrics.org/pipermail/ofw/2007-July/001613.html (posted by Jan from OFW)
> 
> Our solution:
> 1. Initially, we decided to remove all __ptr64 attributes except those inside IOCTL structures, and then to put a PtrToPtr64() conversion on every assignment to a long pointer
> (like my_type* __ptr64 ptr = PtrToPtr64(s_ptr);  )
> With this solution we changed a huge amount of code, so the patch became unreadable. It was also difficult to validate that all long pointers (with the __ptr64 attribute) were used properly.
> 
> 2. So, we decided on another solution:
>  All __ptr64 occurrences were replaced by either:
>  i) the TO_LONG_PTR(type, field) macro, when they occurred inside a structure
> ii) the VOID_PTR64 macro otherwise (defined as an empty macro)
> 
> #define CONCAT(str1, str2) str1##str2
> 
> #define TO_LONG_PTR(type,member_name) \
>     union { type member_name;  uint64_t CONCAT(member_name,_padding) ; }
> 
> Thus, we can both preserve a uniform layout of structs in user and kernel mode and avoid unsafe pointer arithmetic!
> The patch is now much more readable, but it still consists of thousands of lines.
> 
> 2. Migration to WDK
> The main issue here was to preserve backward compatibility with the DDK.
> We were able to compile our stack with the WDK; the main problems we found were:
> 
> 1. The WDK uses a newer version of the SDK (the Vista SDK). So, when two or more versions of the SDK are installed on the same build machine, one has to update the
> PLATFORM_SDK_PATH variable to point to the proper version of the SDK (for example, PLATFORM_SDK_PATH=%sysdrive%:\PROGRA~1\MI2578~1\windows\v6.1)
> 
> 2. The verify.src script in the WDK (a new add-on) checks whether your SOURCES file is in the appropriate format.
> For example, you can't put the path to a system .dll directly in TARGETLIBS; you have to use the USE_<MODULE_NAME>=1 macro instead.
> Example:
> Old code : 
>  ....
> TARGETLIBS= \
>    $(CRT_LIB_PATH)\msvcprt.lib\
>    $(SDK_LIB_PATH)\Ws2_32.lib\
>    $(TARGETPATH)\*\mtcr.lib
>  
> New code :
> USE_MSVCRT=1
> USE_NTDLL=1
>  
> TARGETLIBS= \
>    $(SDK_LIB_PATH)\Ws2_32.lib\
>    $(TARGETPATH)\*\mtcr.lib
> 
> 3. Some other problems, like multiple-include errors in .rc files, or a problem with substituting more than one symbolic constant into a string in makefiles (in some versions of the WDK)
> 
> 
> Currently, we are continuing to test and will announce these patches as soon as the testing is finished.
> 
> Naslednikov Alexander (a.k.a XaleX)
> Windows Team
> Mellanox Technologies 
> 
> 
> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
> Sent: Tuesday, April 08, 2008 4:10 PM
> To: Ishai Rabinovitz
> Cc: ofw at lists.openfabrics.org
> Subject: [ofw] WDK build environment migration thoughts
> 
> Hello,
>   I strongly believe it would help the WinOF community in transitioning to the WDK build environment if the connectX branch
> (svn:gen1\branches\ConnectX) was used as a WDK build environment staging grounds prior to merging the WDK modifications into the mainline trunk.
> This has been talked about before although it still (as of last Friday) does not build using the latest WDK version.
>  
> One week prior to merging the WDK fixes into the mainline trunk, if you were to push all the WDK fixes into the ConnectX branch and then advertise on the ofw mailing list the availability of a WDK build branch along with
> 
>   1) how to build in the WDK environment,
>      which version of the WDK is required + a URL link where to get the WDK.
> 
>   2) An explanation of why and how the __ptr64 attributes were removed along with how
>      others should correct their codes containing __ptr64 attributes.
> 
>   3) updates to the WinOF wiki page describing how to build in the WDK env.
> 
> Let this branch exist for one week, receiving feedback from the list and then merge into the mainline trunk.
> 
> Using this approach is certainly community friendly and may prevent developer surprises.
> 
> ConnectX branch availability dates plus when the actual WDK fixes would be merged into the mainline trunk would be published beforehand.
> 
> 
> Thanks for your consideration,
> 
> Stan.
> 
> 
> 
> _______________________________________________
> ofw mailing list
> ofw at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw