[ofw] WinOF 2.1 RC 5 available for download
Smith, Stan
stan.smith at intel.com
Tue Sep 22 10:21:30 PDT 2009
WinOF 2.1 Release Candidate #5 (RC5) is available @ http://www.openfabrics.org/downloads/WinOF/v2.1-RC5/
*** WinOF is now installed into %ProgramFiles%\WinOF on all OS variants & architectures!
Changes in RC5 from RC4
-----------------------
SVN Commits: 2390 ... 2450
Revision: 2450
Author: stansmith
Date: 4:07:14 PM, Friday, September 18, 2009
Message:
[DAPL2] wait for async processing thread to actually exit.
DAPL doesn't actually wait for the async processing thread to exit before
allowing the library to close. It will wait up to 10 seconds, which under
heavy load isn't enough time. Since the thread is created by an application
level thread, it will continue to run as long as the application runs. But
if the application closes the library, then all library data and code are
invalid, which can result in the thread executing something that is no longer
library code and accessing freed memory.
With this change, I was able to run MPI ping-pong with 16 ranks on a single
system (scm provider) 1300 times without a crash.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_cma/device.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_scm/device.c
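The difference between a bounded wait and waiting for the thread to actually exit (revision 2450) can be sketched as follows. This is only an illustration in Python, not the DAPL code: AsyncWorker, close_with_timeout and close_and_join are hypothetical names, and the delay merely simulates a loaded system.

```python
import threading
import time


class AsyncWorker:
    """Hypothetical stand-in for a library-owned async processing thread."""

    def __init__(self, shutdown_delay=0.0):
        self._stop_event = threading.Event()
        self._shutdown_delay = shutdown_delay  # simulates a slow exit under load
        self.exited = False
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        self._stop_event.wait()           # idle until asked to stop
        time.sleep(self._shutdown_delay)  # exit path takes a while
        self.exited = True

    def close_with_timeout(self, timeout):
        """Bounded wait: returns False if the thread is still running."""
        self._stop_event.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()

    def close_and_join(self):
        """Unbounded wait: does not return until the thread has exited."""
        self._stop_event.set()
        self._thread.join()
        return not self._thread.is_alive()
```

With a bounded wait the caller can return while the thread is still alive, which is the window in which the library could be unloaded underneath it. The unbounded join closes that window at the cost of blocking the caller for as long as the thread needs.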
Revision: 2449
Author: stansmith
Date: 4:04:33 PM, Friday, September 18, 2009
Message:
[DAPL2] add cleanup/release code for timer thread.
dapl_set_timer() creates a thread to process timers for dat_ep_connect but
provides no mechanism to destroy it or make it exit during dapl library unload.
Timers are initialized in the library init code and should be released in the
fini code. Add a dapl_timer_release call to the dapl_fini function to check the
state of the timer thread and destroy it before exiting.
Signed-off-by: Arlin Davis <arlin.r.davis at intel.com>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/common/dapl_timer_util.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/common/dapl_timer_util.h
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/udapl/dapl_init.c
Revision: 2448
Author: stansmith
Date: 4:02:31 PM, Friday, September 18, 2009
Message:
[WinVerbs] fix crash accessing freed memory from async thread
If an application exits while asynchronous accept processing is queued,
it's possible for the async processing to access the IbCmId after it has
been freed. A similar problem involving access to the verbs QP handle was
fixed earlier.
A simpler, more generic solution to this problem is to handle application
exit in the same manner as device removal, and lock the winverbs provider
lookup lists with exclusive access. Asynchronous operations that are in
progress will run to completion, and future operations will be blocked until
the provider cleanup thread has completed. Once they run, they will fail
to acquire a reference on the desired object, which should result in a
graceful failure.
This avoids more complicated locking to use handles belonging to the lower
level code. If a reference on an object can be acquired, the handle will
be available for use until the reference is released. To handle IB CM
callbacks, additional state checking is required to avoid processing
CM events when we're trying to destroy the endpoint.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_ep.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_provider.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_qp.c
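The "fail to acquire a reference, then fail gracefully" pattern described for revision 2448 can be modeled roughly as below. This is a simplified sketch in Python, not the winverbs kernel code: ProviderTable, async_accept and the string results are hypothetical, and the sketch does not model waiting for outstanding references to drain before cleanup.

```python
import threading


class ProviderTable:
    """Hypothetical lookup table guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {}   # id -> [object, reference count]
        self._active = True

    def add(self, obj_id, obj):
        with self._lock:
            self._objects[obj_id] = [obj, 0]

    def acquire(self, obj_id):
        """Return the object with a reference held, or None if unavailable."""
        with self._lock:
            slot = self._objects.get(obj_id)
            if not self._active or slot is None:
                return None
            slot[1] += 1
            return slot[0]

    def release(self, obj_id):
        with self._lock:
            slot = self._objects.get(obj_id)
            if slot is not None and slot[1] > 0:
                slot[1] -= 1

    def cleanup(self):
        """Application exit or device removal: lookups wait on the lock,
        then find nothing once cleanup has finished."""
        with self._lock:
            self._active = False
            self._objects.clear()


def async_accept(table, ep_id):
    """Hypothetical queued operation that fails gracefully if the
    endpoint can no longer be referenced."""
    ep = table.acquire(ep_id)
    if ep is None:
        return "failed: endpoint unavailable"
    try:
        return "accepted on " + ep
    finally:
        table.release(ep_id)
```

The point of the design is that the queued operation never touches a lower-level handle unless it first obtained a reference through the locked lookup, so no extra locking around the handle itself is needed.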
Revision: 2447
Author: stansmith
Date: 2:24:01 PM, Friday, September 18, 2009
Message:
[WinOF] detect possible ConnectX HCA driver load failure and suggest examination of system event
log to ascertain if invalid firmware is an issue.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/CustomActions.vbs
Modified : /gen1/trunk/WinOF/WIX/CustomActions.vbs
Revision: 2439
Author: stansmith
Date: 10:29:29 AM, Wednesday, September 16, 2009
Message:
[IBAL] use non-pageable memory to prevent possible problems on power down.
IBAL uses pageable memory to create the PnP context. This can cause problems
in power-down flows at times of system contention. We saw a similar case at a customer site.
There is no strong evidence that this was the cause, but with this patch IBAL is
safer at no cost.
WinOF 2.1 testing has demonstrated that with this patch, the infrequent (1 out of 10) power-down
BSODs have disappeared.
Found by Hobin Lee (Xsigo), signed off by Leo.
----
Modified : /gen1/branches/WOF2-1/core/al/kernel/al_pnp.c
Revision: 2428
Author: stansmith
Date: 4:07:05 PM, Tuesday, September 15, 2009
Message:
[WinOF] Streamline WinOF uninstall so that it plays nicely with MSFT PNP.
1) allow PNP to remove .inf referenced files and cleanup driver store.
2) shutdown ND & WSD prior to PNP device removal.
3) remove stale code which checks for OpenIB installs and forces a reboot
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/CustomActions.vbs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/InstallExecuteSeq.inc
Revision: 2427
Author: stansmith
Date: 4:02:17 PM, Tuesday, September 15, 2009
Message:
[DAPL2+Winverbs] use private heaps for debug + local control.
----
Modified : /gen1/branches/WOF2-1/core/winverbs/user/wv_main.cpp
Modified : /gen1/branches/WOF2-1/core/winverbs/user/wv_memory.h
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_scm/cm.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/udapl/windows/dapl_osd.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/udapl/windows/dapl_osd.h
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dat/udat/windows/dat_osd.c
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dat/udat/windows/dat_osd.h
Modified : /gen1/branches/WOF2-1/ulp/libibverbs/src/ibv_main.cpp
Modified : /gen1/branches/WOF2-1/ulp/libibverbs/src/ibverbs.h
Modified : /gen1/branches/WOF2-1/ulp/librdmacm/src/cma.h
Modified : /gen1/branches/WOF2-1/ulp/librdmacm/src/cma_main.cpp
Revision: 2426
Author: stansmith
Date: 2:23:13 PM, Wednesday, September 09, 2009
Message:
[WinOF] allow 64-bit installer, remove win64=no and default to Product/Platform
specification.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/OpenSM_service.inc
Revision: 2425
Author: stansmith
Date: 2:20:34 PM, Wednesday, September 09, 2009
Message:
[WinOF] Correct typo RemoveShorcutFolder --> ProgramMenuDir so ProgramMenu always
gets deleted.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/Docs.inc
Revision: 2424
Author: stansmith
Date: 2:18:00 PM, Wednesday, September 09, 2009
Message:
[WinOF] remove redundant Root spec as TARGETDIR implies '%WindowsVolume%\'
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/x86/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/x86/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/x86/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wxp/x86/wof.wxs
Revision: 2418
Author: stansmith
Date: 9:44:42 AM, Wednesday, September 02, 2009
Message:
[IBBUS,COMPLIB] Eliminate re-initialization of the stop lock. Crash reported
upon running "System Common Scenario" WHQL test with our stack.
The crash: C4 (0xd7), which means Driver Verifier detected re-initialization
of a Remove Lock.
Signed-off-by: Leonid Keller <leonid at mellanox.co.il>
----
Modified : /gen1/branches/WOF2-1/core/bus/kernel/bus_pnp.c
Modified : /gen1/branches/WOF2-1/core/complib/kernel/cl_pnp_po.c
Revision: 2402
Author: stansmith
Date: 3:00:33 PM, Tuesday, September 01, 2009
Message:
[WinOF] Shut down NetworkDirect and Winsock Direct before DIFxApp removes devices.
Makes sure no lingering device references are on the IB stack which would prevent
components from being removed.
Moved ND/WSD shutdown into separate CustomAction called before MsiProcessDevices.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/CustomActions.vbs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/InstallExecuteSeq.inc
Revision: 2401
Author: stansmith
Date: 2:56:34 PM, Tuesday, September 01, 2009
Message:
[DAPL2] udapl/scm: convert error code into dapl error code
Intel MPI checks the uDAPL error code when calling dat_psp_create() to see if
the port number that it provides is in use or not. Convert winsock error codes
to unix errno values.
This fixes the following error reported by Intel MPI:
'DAPL provider is not found and fallback device is not enabled'
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_scm/cm.c
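The kind of translation described for revision 2401 can be sketched as a small lookup table. This Python sketch is an illustration only: the Winsock values are the standard ones from the Windows headers, but the set of codes covered, the wsa_to_errno name and the EINVAL fallback are assumptions, since the commit message does not show the actual table.

```python
import errno

# Standard Winsock error values (Windows SDK).
WSAEWOULDBLOCK = 10035
WSAEADDRINUSE = 10048
WSAETIMEDOUT = 10060
WSAECONNREFUSED = 10061

# Illustrative subset; the real provider may translate more or different codes.
_WSA_TO_ERRNO = {
    WSAEWOULDBLOCK: errno.EWOULDBLOCK,
    WSAEADDRINUSE: errno.EADDRINUSE,
    WSAETIMEDOUT: errno.ETIMEDOUT,
    WSAECONNREFUSED: errno.ECONNREFUSED,
}


def wsa_to_errno(wsa_error):
    """Translate a Winsock error into a unix errno value.

    Unknown codes fall back to EINVAL, which is an arbitrary choice
    made for this sketch.
    """
    return _WSA_TO_ERRNO.get(wsa_error, errno.EINVAL)


def port_in_use(wsa_error):
    """Hypothetical caller-side check for an occupied port."""
    return wsa_to_errno(wsa_error) == errno.EADDRINUSE
```

A caller that only understands unix errno values can then test for "address in use" without knowing anything about Winsock numbering.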
Revision: 2400
Author: stansmith
Date: 2:53:35 PM, Tuesday, September 01, 2009
Message:
[WINMAD] winmad: allocate registration struct from NonPagedPool.
Apparently data structures that are accessed from within MAD callbacks must be
allocated from NonPagedPool. Allocate the WM_REGISTRATION structure from
NonPagedPool.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winmad/kernel/wm_reg.c
Revision: 2399
Author: stansmith
Date: 6:07:33 PM, Friday, August 28, 2009
Message:
[WinOF] Install Librdmacm.dll in a consistent place for all installs (%windir%).
After 2.1, explore installing .dll into [SYSTEM] folder.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/winverbs_OFED.inc
Revision: 2398
Author: stansmith
Date: 6:03:54 PM, Friday, August 28, 2009
Message:
[WinOF] Add WinOF to Command Window name to distinguish it from other
Command Windows as Svr 2008 likes to add recently used commands to the start menu.
signed off by stan.smith at intel.com
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/common/Docs.inc
Revision: 2397
Author: stansmith
Date: 6:01:14 PM, Friday, August 28, 2009
Message:
[WINVERBS] should have been part of Revision: 2391; DllMain is called multiple times
for a given process. Prevent double initialization of critical sections by only
initializing it during process attach. This avoids corrupting the critical section
while it may be in use.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/ulp/librdmacm/src/cma_main.cpp
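The "initialize only on process attach" rule can be modeled as below. This is a Python model of the control flow, not the librdmacm or libibverbs source: the reason codes are the standard DllMain values, while the Library class, its lock (standing in for a critical section) and init_count are hypothetical.

```python
import threading

# Reason codes passed to DllMain (Windows SDK values).
DLL_PROCESS_DETACH = 0
DLL_PROCESS_ATTACH = 1
DLL_THREAD_ATTACH = 2
DLL_THREAD_DETACH = 3


class Library:
    """Hypothetical model of a DLL whose entry point is called repeatedly."""

    def __init__(self):
        self.lock = None      # stands in for a critical section
        self.init_count = 0

    def dll_main(self, reason):
        # Create the lock once, on process attach. Thread attach and
        # detach notifications must leave the existing lock untouched,
        # because another thread may be holding it.
        if reason == DLL_PROCESS_ATTACH:
            self.lock = threading.Lock()
            self.init_count += 1
        elif reason == DLL_PROCESS_DETACH:
            self.lock = None
        return True
```

If the initialization ran for every reason code instead, each new thread would replace a lock that existing threads might currently hold, which is the corruption the commit message describes.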
Revision: 2396
Author: stansmith
Date: 5:58:34 PM, Friday, August 28, 2009
Message:
[WINVERBS] winverbs: fix race in async connect handling.
If an application calls Connect or Accept, its IRP is queued to a work queue for
asynchronous processing.
However, if the application crashes or exits before the work queue can process
the IRP, the cleanup code will call WvEpFree(). This destroys the IbCmId.
When the work queue finally runs, it can access a freed IbCmId. This is bad.
A similar race exists with the QP and the asynchronous disconnect processing.
The disconnect processing can access the hVerbsQp handle after it has been destroyed.
Additionally, in all three cases, the IRPs assume that the WV provider is able
to process IRPs. Specifically, they require that the index tables maintained by
the provider are still valid. References must be held on the WV provider until
the IRPs finish their processing to ensure this.
Fix invalid accesses to the IbCmId and hVerbsQp handles by locking around their
use after valid state checks. In the case of the QP, we add a guarded mutex for
synchronization purposes and use that in place where the PD mutex had been used.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_ep.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_ep.h
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_qp.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_qp.h
Revision: 2395
Author: stansmith
Date: 5:52:02 PM, Friday, August 28, 2009
Message:
[WINVERBS] To help match memory allocations with free, replace ExFreePool with ExFreePoolWithTag.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winmad/kernel/wm_driver.c
Modified : /gen1/branches/WOF2-1/core/winmad/kernel/wm_reg.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_cq.c
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_srq.c
Modified : /gen1/branches/WOF2-1/etc/kernel/work_queue.c
Revision: 2394
Author: stansmith
Date: 5:44:58 PM, Friday, August 28, 2009
Message:
[WINVERBS] Endpoints are not maintained in a list associated with a provider.
The list entry for an endpoint is used to track connection requests with listens.
When an endpoint is unassociated from a listen, it is removed from the listen list.
Trying to remove it from a list during provider cleanup results in a duplicate removal,
can corrupt the listen list, and may access freed memory.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_provider.c
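Why a duplicate removal is harmful can be seen with a toy intrusive list. The sketch below is Python and purely illustrative: Entry, insert_tail, remove_entry and names are hypothetical helpers that mimic the usual doubly linked list unlink, where an entry is removed using its own forward and backward pointers.

```python
class Entry:
    """Intrusive doubly linked list node; a fresh node points at itself."""

    def __init__(self, name):
        self.name = name
        self.flink = self
        self.blink = self


def insert_tail(head, entry):
    entry.flink = head
    entry.blink = head.blink
    head.blink.flink = entry
    head.blink = entry


def remove_entry(entry):
    # Unlink using the entry's own neighbor pointers. The entry's
    # pointers are left as they were, so they go stale after removal.
    entry.blink.flink = entry.flink
    entry.flink.blink = entry.blink


def names(head):
    """Walk the list forward and collect the entry names."""
    out = []
    node = head.flink
    while node is not head:
        out.append(node.name)
        node = node.flink
    return out


listen_list = Entry("head")
ep1, ep2 = Entry("ep1"), Entry("ep2")
insert_tail(listen_list, ep1)
insert_tail(listen_list, ep2)

remove_entry(ep1)   # legitimate removal
remove_entry(ep2)   # legitimate removal, list is now empty
remove_entry(ep1)   # duplicate removal through stale pointers
```

After the duplicate removal the list head points at ep2 again, even though ep2 was already removed. In kernel code that entry may have been freed, so the next walk of the list would touch freed memory.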
Revision: 2393
Author: stansmith
Date: 5:42:29 PM, Friday, August 28, 2009
Message:
[WINVERBS] The winverbs PD structure contains both an event and a guarded mutex.
Both must be allocated as part of resident memory, or vague system corruptions may occur
if their memory is paged out. The fix is to allocate the PD structure from NonPagedPool.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_pd.c
Revision: 2392
Author: stansmith
Date: 5:39:47 PM, Friday, August 28, 2009
Message:
[WINVERBS] Fix a memory leak. We need to free the port array, which is allocated
separately from the device structure.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/core/winverbs/kernel/wv_device.c
Revision: 2391
Author: stansmith
Date: 5:37:08 PM, Friday, August 28, 2009
Message:
[WINVERBS] DllMain is called multiple times for a given process.
Prevent double initialization of critical sections
by only initializing it during process attach.
This avoids corrupting the critical section while it may be in use.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
----
Modified : /gen1/branches/WOF2-1/ulp/dapl2/dapl/openib_cma/cm.c
Modified : /gen1/branches/WOF2-1/ulp/libibverbs/src/ibv_main.cpp
Revision: 2390
Author: stansmith
Date: 5:33:41 PM, Friday, August 28, 2009
Message:
[WinOF] All installs now install into 'Program Files' and not 'Program Files (x86)'.
Cleanup references to \Program Files (x86)\WinOF.
----
Modified : /gen1/branches/WOF2-1/WinOF/WIX/HPC/cert-add.bat
Modified : /gen1/branches/WOF2-1/WinOF/WIX/README_release.txt
Modified : /gen1/branches/WOF2-1/WinOF/WIX/Release_notes.htm
Modified : /gen1/branches/WOF2-1/WinOF/WIX/dat.conf
Modified : /gen1/branches/WOF2-1/WinOF/WIX/ia64/Command Window.lnk
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/win7/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wlh/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/ia64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/wnet/x64/wof.wxs
Modified : /gen1/branches/WOF2-1/WinOF/WIX/x64/Command Window.lnk
Modified : /gen1/branches/WOF2-1/ulp/dapl2/test/dapltest/scripts/dt-cli.bat
Modified : /gen1/branches/WOF2-1/ulp/dapl2/test/dapltest/scripts/dt-svr.bat