[ewg] [GIT PULL ofed-1.5.2] iw_cxgb3 critical bug fixes
Steve Wise
swise at opengridcomputing.com
Sat Aug 28 07:14:40 PDT 2010
Hey Vlad,
Please pull these two critical bug fixes for iw_cxgb3 from:
ssh://vlad@sofa.openfabrics.org/~swise/scm/ofed_kernel.git ofed_1_5
They are needed in ofed-1.5.2 for large cluster stability.
Thanks
Steve.
--------
commit d3a225ba07fdd7bf4eccaab383818dde41231166
Author: Steve Wise <swise at opengridcomputing.com>
Date: Sat Aug 28 08:07:38 2010 -0500
RDMA/cxgb3: Remove BUG_ON() on CQ rearm failure
Failure to rearm a CQ means the cxgb3 device is wedged, but we shouldn't
kill the whole system with a BUG_ON() if this happens.
Signed-off-by: Steve Wise <swise at opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd at cisco.com>
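For readers who have not seen the patch, the idea in the commit above can be sketched in user-space C roughly as follows. This is not the driver code: fake_hw_rearm() and rearm_cq() are hypothetical names, and the -EIO return value is an assumption made for illustration. The point is only that a failed rearm is reported and returned to the caller instead of halting the machine.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the hardware rearm request.
 * Returns 0 on success, a negative errno value when the device is wedged.
 */
static int fake_hw_rearm(int device_wedged)
{
    return device_wedged ? -EIO : 0;
}

/*
 * Rearm a completion queue. On failure, log the condition and hand the
 * error back to the caller; do not bring the whole system down.
 */
static int rearm_cq(int device_wedged)
{
    int err = fake_hw_rearm(device_wedged);

    if (err < 0) {
        fprintf(stderr, "CQ rearm failed (%d); device is likely wedged\n", err);
        return err;
    }
    return 0;
}
```

The device is still unusable after such a failure, but the rest of the node keeps running and the error can be seen in the logs.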
commit f3495ab83c7906ada0fea19867c16c634014e160
Author: Steve Wise <swise at opengridcomputing.com>
Date: Sat Aug 28 08:02:42 2010 -0500
RDMA/cxgb3: Don't exceed the max HW CQ depth.
From: Steve Wise <swise at opengridcomputing.com>
The max depth supported by T3 is 64K entries.
This bug will cause stalls and possibly crashes in large MPI clusters.
Signed-off-by: Steve Wise <swise at opengridcomputing.com>
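A minimal sketch of the kind of check this commit implies, written as standalone C and not taken from the patch. The 64K limit comes from the commit message; the names MAX_HW_CQ_DEPTH and check_cq_depth(), and the choice of -EINVAL for an oversized request, are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>

/* 64K entries, the T3 limit stated in the commit message. */
#define MAX_HW_CQ_DEPTH 65536u

/*
 * Validate a requested CQ depth before any hardware resources are
 * allocated. Returns 0 if the depth is usable, -EINVAL otherwise.
 */
static int check_cq_depth(unsigned int entries)
{
    if (entries == 0 || entries > MAX_HW_CQ_DEPTH)
        return -EINVAL;
    return 0;
}
```

With a check like this, a request for 65536 entries is accepted and a request for 65537 is rejected up front, so the caller sees an error instead of a CQ the hardware cannot address.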