[ofw] partial review of mlx4 branch

Fab Tillier ftillier at microsoft.com
Tue Oct 18 07:54:28 PDT 2011


Fab Tillier wrote on Tue, 18 Oct 2011 at 07:27:17

> Leonid Keller wrote on Tue, 18 Oct 2011 at 06:26:13
> 
>> PSB
>> 
>> -----Original Message-----
>> From: Fab Tillier [mailto:ftillier at microsoft.com]
>> Sent: Monday, October 17, 2011 3:05 AM
>> 
>> Hi Leonid,
>> 
>> You can use driver verifier to track memory allocations (including
>> dereference after free and IRQL enforcement) for kernel drivers, and app
>> verifier for user-mode DLLs and applications.
>> 
>> [LK] This started because Verifier didn't print anything on Win8,
>> apparently because of some bug. On Win7 it printed the pools that have
>> unreleased buffers, but that doesn't help, because we do all our
>> allocations via one function.
> 
> Right, because you abstract the memory allocation calls, rather than using the
> built-in tools.  That's my point - eliminate the allocations.

This was supposed to say "eliminate the abstractions"...

> This also allows you to use different pool tags.
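To make the pool-tag point concrete, here is a minimal user-mode sketch of what per-tag accounting buys: when every subsystem allocates under its own tag, outstanding bytes can be attributed to a subsystem instead of to one shared wrapper. Everything in it (`tagged_alloc`, `tagged_free`, `tagged_outstanding`, `MAKE_TAG`) is hypothetical and only models the idea; in a driver the tag would go straight to `ExAllocatePoolWithTag` and the OS tooling would keep the counters.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-mode model of tagged pool accounting; not OFED code. */
#define MAX_TAGS 8
#define MAKE_TAG(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

typedef struct {
    uint32_t tag;          /* four characters packed into 32 bits, 0 = unused slot */
    size_t   outstanding;  /* bytes allocated under this tag and not yet freed */
} tag_stat_t;

typedef struct {
    uint32_t tag;          /* tag the block was allocated under */
    size_t   size;         /* caller-visible size of the block */
} alloc_hdr_t;

static tag_stat_t g_stats[MAX_TAGS];

/* Find the counter slot for a tag, claiming a free slot on first use. */
static tag_stat_t *stat_for(uint32_t tag)
{
    for (int i = 0; i < MAX_TAGS; i++) {
        if (g_stats[i].tag == tag)
            return &g_stats[i];
        if (g_stats[i].tag == 0) {
            g_stats[i].tag = tag;
            return &g_stats[i];
        }
    }
    return NULL; /* table full: the block is simply not accounted */
}

void *tagged_alloc(size_t size, uint32_t tag)
{
    alloc_hdr_t *hdr = malloc(sizeof(*hdr) + size);
    tag_stat_t *stat;

    if (hdr == NULL)
        return NULL;
    hdr->tag = tag;
    hdr->size = size;
    stat = stat_for(tag);
    if (stat != NULL)
        stat->outstanding += size;
    return hdr + 1; /* caller sees the memory just past the header */
}

void tagged_free(void *p)
{
    alloc_hdr_t *hdr;
    tag_stat_t *stat;

    if (p == NULL)
        return;
    hdr = (alloc_hdr_t *)p - 1;
    stat = stat_for(hdr->tag);
    if (stat != NULL)
        stat->outstanding -= hdr->size;
    free(hdr);
}

/* Bytes still allocated under a tag; non-zero at unload suggests a leak. */
size_t tagged_outstanding(uint32_t tag)
{
    tag_stat_t *stat = stat_for(tag);
    return stat ? stat->outstanding : 0;
}
```

With one tag per subsystem, a non-zero `tagged_outstanding()` result points at the owner of the leak, which a single shared allocation function cannot do.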
> 
>> The added mechanism prints the callers that allocated the unreleased
>> buffers. We found several leaks with its help.
>> 
>> For IBAT, I believe all IPHelper calls should be done in the kernel, so
>> that path records don't need to be sent to user-mode unless the user
>> needs to inspect/modify the contents.  To that end, I think connection
>> establishment IOCTLs should take as input the source and destination IP
>> address and be able to figure out the rest.  This would allow the
>> kernel drivers to react appropriately to such requests as needed for
>> their transport, be it IB or RoCE. An IOCTL interface for this is much
>> better, really - think of WinVerbs exactly as you do of IBAL.  Both
>> have a connect IOCTL.
>> 
>> [LK] The best place for the IBAT service is the bus driver, since it is
>> always present. But I'm not sure it can use IPHelper.
> 
> IPHelper is available to kernel callers, but it can't be called at DISPATCH_LEVEL.
> 
>> All other drivers may or may not be present.
>> I'd suggest using IBAT_EX for now, until someone has time to develop it
>> in the kernel.
> 
> I'll work on this and submit a patch at some point.  I have something
> rudimentary prototyped but need to polish it up.  How strongly do people
> feel about manipulating path records in user-mode? My current direction has
> the paths entirely managed in the kernel (including caching), since path
> records are IB-specific and don't apply to RoCE or iWARP (yes, I'm still hoping
> some iWARP vendors join the project, <sigh>).
> 
> -Fab
> 
>> Cheers,
>> -Fab
>> 
>> Leonid Keller wrote on Sun, 16 Oct 2011 at 15:24:10
>> 
>>> Hi Sean,
>>> 
>>> Thank you for the comments.
>>> I'm going to answer them one-by-one in another mail.
>>> For now - some general notes.
>>> 
>>> 1. Winverbs
>>> I don't think I broke IBA. Please show me where, if I missed
>>> something. The version increase is needed to announce the new
>>> 'Transport' field. As to the dependency on the IBAL/COMPLIB DLLs: it
>>> is really IBAT_EX that depends on them, while Winverbs depends on
>>> IBAT_EX. The latter is needed to add seamless support for RoCE.
>>> Winverbs.dll is supplied as part of the OFED suite; it depended on the
>>> kernel IBAT service anyway, and I don't think it is too bad if it
>>> depends on other DLLs of the suite. On the other hand, it's not good
>>> when different user-space components use the IBAT service however
>>> they want, because the implementation of the service can change. We
>>> arrived at the idea of IBAT_EX.dll, which hides the implementation of
>>> the IBAT service, because of the problems with the current
>>> implementation. In the OFED stack it is implemented inside the IPoIB
>>> driver, supports only the IB transport, and is present once per
>>> machine. We need it to support two transports today, IB and RoCE, and
>>> more in the future. We need to deal with the situation where no IPoIB
>>> driver is loaded. We need to support configurations where several HCA
>>> cards with several transports are working simultaneously. That's why
>>> we developed IBAT_EX and changed applications to use it. You may
>>> change WinVerbs.dll back and implement RoCE support inside it. You may
>>> remove the complib dependency from IBAT_EX, but it will still need
>>> IBAL. (One could also replace the calls to ibal.dll functions with
>>> IOCTLs, but I personally do not like the idea.)
>>> 
>>> 2. Complib
>>> I extended the complib memory tracking mechanism so that memory leaks
>>> get printed.
>>> Could you suggest how I can do this using standard system functions?
>>> 
>>> 3. IBA
>>> What IBAs were broken?
>>> How do you suggest extending functionality while keeping IBA intact?
>>> Or did I misunderstand your idea?
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: ofw-bounces at lists.openfabrics.org [mailto:ofw-
>>> bounces at lists.openfabrics.org] On Behalf Of Hefty, Sean
>>> Sent: Tuesday, October 11, 2011 8:38 PM
>>> To: ofw_list
>>> Subject: [ofw] partial review of mlx4 branch
>>> 
>>> See below for comments on the changes in branches/mlx4 compared to the
>>> trunk.  Hopefully all of my comments are marked with 'SH:'.  I did not
>>> review the hw subdirectories.  The changes there are extensive.
>>> 
>>> The biggest concerns from my personal perspective were:
>>> 
>>> * winverbs cannot depend on the ibal or complib libraries
>>> * ibverbs must maintain binary compatibility with existing applications
>>> * we must support a mix of old and new libraries
>>> 
>>> The biggest concern that I believe OFA should have is:
>>> 
>>> * Binary compatibility with existing applications must be maintained.
>>>   This includes all library interfaces as well as the user to kernel ABI.
>>> - Sean
>>> 
>>> 
>>> 
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/al/kernel/al_cm_cep.c branches\mlx4/core/al/kernel/al_cm_cep.c
>>> --- trunk/core/al/kernel/al_cm_cep.c	2011-09-13 09:15:33.785667700 -0700
>>> +++ branches\mlx4/core/al/kernel/al_cm_cep.c	2011-10-10 16:59:00.857865300 -0700
>>> @@ -677,6 +677,16 @@ __reject_req(
>>>  	p_mad->timeout_ms = 0;
>>>  	p_mad->resp_expected = FALSE;
>>> +	/* Switch src and dst in GRH */
>>> +	if(p_mad->grh_valid)
>>> +	{
>>> +		ib_gid_t dest_gid = {0};
>>> 
>>> SH: no need to initialize
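A stand-alone sketch of the same swap shows why the `= {0}` initializer is redundant: the temporary is fully overwritten by the first `memcpy` before anything reads it. The types `sketch_gid_t` and `sketch_grh_t` and the function `sketch_swap_gids` are hypothetical stand-ins for `ib_gid_t` and the GRH, reduced to the two fields involved.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for ib_gid_t and the GRH; only the GIDs matter here. */
typedef struct { uint8_t raw[16]; } sketch_gid_t;
typedef struct {
    sketch_gid_t src_gid;
    sketch_gid_t dest_gid;
} sketch_grh_t;

/* Swap source and destination GIDs so the reply goes back to the sender. */
void sketch_swap_gids(sketch_grh_t *grh)
{
    sketch_gid_t tmp; /* no initializer: written in full before it is read */

    memcpy(&tmp, &grh->src_gid, sizeof(tmp));
    memcpy(&grh->src_gid, &grh->dest_gid, sizeof(grh->src_gid));
    memcpy(&grh->dest_gid, &tmp, sizeof(grh->dest_gid));
}
```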
>>> 
>>> +		memcpy(&dest_gid, &p_mad->p_grh->src_gid, sizeof(ib_gid_t));
>>> +		memcpy(&p_mad->p_grh->src_gid, &p_mad->p_grh->dest_gid, sizeof(ib_gid_t));
>>> +		memcpy(&p_mad->p_grh->dest_gid, &dest_gid, sizeof(ib_gid_t));
>>> +	}
>>> +
>>>  	__cep_send_mad( p_port_cep, p_mad );
>>>  
>>>  	AL_EXIT( AL_DBG_CM );
>>> @@ -3390,7 +3400,7 @@ __cep_queue_mad(
>>>  	// TODO: Remove - manage above core kernel CM code
>>>  	/* NDI connection request case */
>>>  	if ( p_cep->state == CEP_STATE_LISTEN &&
>>> -		(p_cep->sid & ~0x0ffffffI64) == IB_REQ_CM_RDMA_SID_PREFIX )
>>> +		(p_cep->sid & IB_REQ_CM_RDMA_SID_PREFIX_MASK) == IB_REQ_CM_RDMA_SID_PREFIX )
>>>  	{ /* Try to complete pending IRP, if any */
>>>  		mad_cm_req_t* p_req = (mad_cm_req_t*)ib_get_mad_buf( p_mad );
>>>  		ib_cm_rdma_req_t *p_rdma_req = (ib_cm_rdma_req_t *)p_req->pdata;
>>> @@ -3401,7 +3411,7 @@ __cep_queue_mad(
>>>  			 (p_rdma_req->ipv != 0x40 && p_rdma_req->ipv != 0x60) )
>>>  		{
>>>  			AL_PRINT_EXIT( TRACE_LEVEL_ERROR, AL_DBG_ERROR,
>>> -				("NDI connection req is rejected: maj_min_ver %d, ipv %#x \n",
>>> +				("RDMA CM connection req is rejected: maj_min_ver %d, ipv %#x \n",
>>>  				p_rdma_req->maj_min_ver, p_rdma_req->ipv ) );
>>>  			return IB_UNSUPPORTED;
>>>  		}
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/al/kernel/al_ndi_cm.c branches\mlx4/core/al/kernel/al_ndi_cm.c
>>> --- trunk/core/al/kernel/al_ndi_cm.c	2011-09-13 09:15:33.836672800 -0700
>>> +++ branches\mlx4/core/al/kernel/al_ndi_cm.c	2011-10-10 16:59:00.909870500 -0700
>>> @@ -461,7 +461,8 @@ static VOID __ndi_acquire_lock(
>>>  	nd_csq_t *p_ndi_csq = (nd_csq_t*)Csq;
>>>  
>>>  	KeAcquireSpinLock( &p_ndi_csq->lock, pIrql );
>>> -}
>>> +}
>>> +
>>> 
>>>  #ifdef NTDDI_WIN8
>>>  static IO_CSQ_RELEASE_LOCK __ndi_release_lock;
>>> @@ -1111,7 +1112,7 @@ __ndi_fill_cm_req(
>>> 
>>>  	memset( p_cm_req, 0, sizeof(*p_cm_req) );
>>> -	p_cm_req->service_id = IB_REQ_CM_RDMA_SID_PREFIX | (p_req->prot << 16) | p_req->dst_port;
>>> +	p_cm_req->service_id = ib_cm_rdma_sid( p_req->prot, p_req->dst_port );
>>>  	p_cm_req->p_primary_path = p_path_rec;
>>>  
>>>  	p_cm_req->qpn = qpn;
>>> @@ -1964,9 +1965,12 @@ ndi_listen_cm(
>>>  	p_csq->state = NDI_CM_LISTEN;
>>>  	__ndi_release_lock( &p_csq->csq, irql );
>>> -	if( (p_listen->svc_id & 0xFFFF) == 0 )
>>> +	if( ib_cm_rdma_sid_port( p_listen->svc_id ) == 0 )
>>>  	{
>>> -		p_listen->svc_id |= (USHORT)cid | (USHORT)(cid >> 16);
>>> +		p_listen->svc_id = ib_cm_rdma_sid(
>>> +			ib_cm_rdma_sid_protocol( p_listen->svc_id ),
>>> +			(USHORT)cid | (USHORT)(cid >> 16)
>>> +			);
>>>  	}
>>>  
>>>  	ib_status = al_cep_listen( h_al, cid, p_listen );
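The service ID helpers introduced in this file replace open-coded bit manipulation. A minimal sketch of what such helpers could look like, assuming the layout implied by the removed expressions (port in the low 16 bits, protocol in bits 16-23, prefix in the upper 40 bits): the `sketch_*` names and the prefix value are assumptions for illustration, and the real `ib_cm_rdma_sid*` definitions may differ, for example in byte order.

```c
#include <stdint.h>

/* Assumed values, inferred from the removed expressions in the diff;
 * the real header definitions may differ (e.g. in byte order). */
#define SKETCH_SID_PREFIX       UINT64_C(0x0000000001000000)
#define SKETCH_SID_PREFIX_MASK  UINT64_C(0xFFFFFFFFFF000000)

/* Build a service ID from an IP protocol number and a port. */
uint64_t sketch_rdma_sid(uint8_t protocol, uint16_t port)
{
    return SKETCH_SID_PREFIX | ((uint64_t)protocol << 16) | port;
}

/* Extract the protocol byte (bits 16-23). */
uint8_t sketch_rdma_sid_protocol(uint64_t sid)
{
    return (uint8_t)(sid >> 16);
}

/* Extract the port (low 16 bits). */
uint16_t sketch_rdma_sid_port(uint64_t sid)
{
    return (uint16_t)sid;
}

/* Non-zero when the upper 40 bits carry the RDMA CM prefix. */
int sketch_is_rdma_sid(uint64_t sid)
{
    return (sid & SKETCH_SID_PREFIX_MASK) == SKETCH_SID_PREFIX;
}
```

Rebuilding the ID from its protocol and a new port, as the listen path now does, avoids OR-ing port bits into an ID whose port field might not be zero.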
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/al/kernel/al_pnp.c branches\mlx4/core/al/kernel/al_pnp.c
>>> --- trunk/core/al/kernel/al_pnp.c	2011-09-13 09:15:33.881677300 -0700
>>> +++ branches\mlx4/core/al/kernel/al_pnp.c	2011-10-10 16:59:00.957875300 -0700
>>> @@ -1438,6 +1438,10 @@ __pnp_check_ports(
>>>  			( (p_new_port_attr->link_state == IB_LINK_ARMED) ||
>>>  			(p_new_port_attr->link_state == IB_LINK_ACTIVE) ) )
>>>  		{
>>> +
>>> +			AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
>>> +				("pkey or gid changes\n") );
>>> +
>>>  			/* A different number of P_Keys indicates a change.*/
>>>  			if( p_old_port_attr->num_pkeys != p_new_port_attr->num_pkeys )
>>>  			{
>>> @@ -1486,6 +1490,8 @@ __pnp_check_ports(
>>>  		if( (p_old_port_attr->lid != p_new_port_attr->lid) ||
>>>  			(p_old_port_attr->lmc != p_new_port_attr->lmc) )
>>>  		{
>>> +			AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
>>> +				("lid/lmc changed \n") );
>>>  			event_rec.pnp_event = IB_PNP_LID_CHANGE;
>>>  			__pnp_process_port_forward( &event_rec );
>>>  		}
>>> @@ -1493,6 +1499,8 @@ __pnp_check_ports(
>>>  		if( (p_old_port_attr->sm_lid != p_new_port_attr->sm_lid) ||
>>>  			(p_old_port_attr->sm_sl != p_new_port_attr->sm_sl) )
>>>  		{
>>> +			AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
>>> +				("sm_lid/sm_sl changed \n") );
>>>  			event_rec.pnp_event = IB_PNP_SM_CHANGE;
>>>  			__pnp_process_port_forward( &event_rec );
>>>  		}
>>> @@ -1500,6 +1508,8 @@ __pnp_check_ports(
>>>  		if( p_old_port_attr->subnet_timeout !=
>>>  			p_new_port_attr->subnet_timeout )
>>>  		{
>>> +			AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
>>> +				("subnet_timeout changed \n") );
>>>  			event_rec.pnp_event = IB_PNP_SUBNET_TIMEOUT_CHANGE;
>>>  			__pnp_process_port_forward( &event_rec );
>>>  		}
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/al/kernel/al_proxy.c branches\mlx4/core/al/kernel/al_proxy.c
>>> --- trunk/core/al/kernel/al_proxy.c	2011-09-13 09:15:34.109700100 -0700
>>> +++ branches\mlx4/core/al/kernel/al_proxy.c	2011-10-10 16:59:01.211900700 -0700
>>> @@ -424,6 +424,10 @@ proxy_pnp_port_cb(
>>> 
>>>  	AL_ENTER( AL_DBG_PROXY_CB );
>>> +	AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
>>> +		("p_pnp_rec->pnp_event = 0x%x (%s)\n",
>>> +		p_pnp_rec->pnp_event, ib_get_pnp_event_str( p_pnp_rec->pnp_event )) );
>>> +
>>>  	p_context = p_pnp_rec->pnp_context;
>>>  
>>>  	/*
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/al/kernel/al_sa_req.c branches\mlx4/core/al/kernel/al_sa_req.c
>>> --- trunk/core/al/kernel/al_sa_req.c	2011-09-13 09:15:33.980687200 -0700
>>> +++ branches\mlx4/core/al/kernel/al_sa_req.c	2011-10-10 16:59:01.096889200 -0700
>>> @@ -234,26 +234,42 @@ sa_req_mgr_pnp_cb(
>>>  	sa_req_svc_t				*p_sa_req_svc;
>>>  	ib_av_attr_t				av_attr;
>>>  	ib_pd_handle_t				h_pd;
>>> -	ib_api_status_t				status;
>>> +	ib_api_status_t				status = IB_SUCCESS;
>>> +	ib_pnp_port_rec_t			*p_port_rec = (ib_pnp_port_rec_t*)p_pnp_rec;
>>> 
>>>  	AL_ENTER( AL_DBG_SA_REQ );
>>>  	CL_ASSERT( p_pnp_rec );
>>>  	CL_ASSERT( p_pnp_rec->pnp_context == &gp_sa_req_mgr->obj );
>>> +	AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
>>> +		("p_pnp_rec->pnp_event = 0x%x (%s)\n",
>>> +		p_pnp_rec->pnp_event, ib_get_pnp_event_str( p_pnp_rec->pnp_event )) );
>>> +
>>>  	/* Dispatch based on the PnP event type. */
>>>  	switch( p_pnp_rec->pnp_event )
>>>  	{
>>>  	case IB_PNP_PORT_ADD:
>>> -		status = create_sa_req_svc( (ib_pnp_port_rec_t*)p_pnp_rec );
>>> +		if ( p_port_rec->p_port_attr->transport == RDMA_TRANSPORT_RDMAOE )
>>> +		{	// RoCE port
>>> +			AL_PRINT( TRACE_LEVEL_WARNING, AL_DBG_ERROR,
>>> +				("create_sa_req_svc is not called for RoCE port %d\n", p_port_rec->p_port_attr->port_num ) );
>>> 
>>> SH: Please change the print from warning / error to indicate that this is the
>>> normal behavior.
>>> 
>>> +		}
>>> +		else
>>> +		{
>>> +			status = create_sa_req_svc( p_port_rec );
>>>  		if( status != IB_SUCCESS )
>>>  		{
>>>  			AL_PRINT( TRACE_LEVEL_ERROR, AL_DBG_ERROR,
>>> -				("create_sa_req_svc failed: %s\n", ib_get_err_str(status)) );
>>> +					("create_sa_req_svc for port %d failed: %s\n",
>>> +					p_port_rec->p_port_attr->port_num, ib_get_err_str(status)) );
>>> +			}
>>>  		}
>>>  		break;
>>> 
>>>  	case IB_PNP_PORT_REMOVE:
>>> -		CL_ASSERT( p_pnp_rec->context );
>>> +		// context will be NULL for RoCE port
>>> +		if ( !p_pnp_rec->context )
>>> +			break;
>>> 
>>> SH: Move this check to the top of the function to avoid duplicating
>>> it.  If the context is set by IB_PNP_PORT_ADD, just add that to the
>>> check.
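The suggested restructuring can be sketched as a single guard ahead of the `switch`. This is a minimal illustration with hypothetical, trimmed-down types (`sketch_pnp_event_t`, `sketch_pnp_rec_t`) rather than the real PnP record; it exempts the port-add event because that is the event that creates the context.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the PnP event and record. */
typedef enum {
    SKETCH_PORT_ADD,
    SKETCH_PORT_REMOVE,
    SKETCH_PORT_ACTIVE,
    SKETCH_SM_CHANGE,
    SKETCH_PORT_DOWN
} sketch_pnp_event_t;

typedef struct {
    sketch_pnp_event_t pnp_event;
    void *context; /* stays NULL for ports that never got an SA request service */
} sketch_pnp_rec_t;

/* Returns 1 if the event was dispatched, 0 if it was skipped by the guard. */
int sketch_pnp_cb(const sketch_pnp_rec_t *rec)
{
    /* One guard replaces the per-case checks. PORT_ADD is exempt because
     * it is the event that sets the context in the first place. */
    if (rec->pnp_event != SKETCH_PORT_ADD && rec->context == NULL)
        return 0;

    switch (rec->pnp_event) {
    case SKETCH_PORT_ADD:
        /* create the service here unless the port is RoCE */
        break;
    case SKETCH_PORT_REMOVE:
        /* destroy the service stored in rec->context */
        break;
    case SKETCH_PORT_ACTIVE:
    case SKETCH_SM_CHANGE:
        /* refresh SM information */
        break;
    case SKETCH_PORT_DOWN:
        /* clear SM information */
        break;
    }
    return 1;
}
```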
>>> 
>>>  		p_sa_req_svc = p_pnp_rec->context;
>>>  		ref_al_obj( &p_sa_req_svc->obj );
>>>  		p_sa_req_svc->obj.pfn_destroy( &p_sa_req_svc->obj, NULL );
>>> @@ -263,15 +279,15 @@ sa_req_mgr_pnp_cb(
>>> 
>>>  	case IB_PNP_PORT_ACTIVE:
>>>  	case IB_PNP_SM_CHANGE:
>>> -		CL_ASSERT( p_pnp_rec->context );
>>> +		// context will be NULL for RoCE port
>>> +		if ( !p_pnp_rec->context )
>>> +			break;
>>>  		AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_SA_REQ,
>>>  			("updating SM information\n") );
>>>  
>>>  		p_sa_req_svc = p_pnp_rec->context;
>>> -		p_sa_req_svc->sm_lid =
>>> -			((ib_pnp_port_rec_t*)p_pnp_rec)->p_port_attr->sm_lid;
>>> -		p_sa_req_svc->sm_sl =
>>> -			((ib_pnp_port_rec_t*)p_pnp_rec)->p_port_attr->sm_sl;
>>> +		p_sa_req_svc->sm_lid = p_port_rec->p_port_attr->sm_lid;
>>> +		p_sa_req_svc->sm_sl = p_port_rec->p_port_attr->sm_sl;
>>> 
>>>  		/* Update the address vector. */
>>>  		status = ib_query_av( p_sa_req_svc->h_av, &av_attr, &h_pd );
>>> @@ -298,7 +314,9 @@ sa_req_mgr_pnp_cb(
>>>  	case IB_PNP_PORT_INIT:
>>>  	case IB_PNP_PORT_ARMED:
>>>  	case IB_PNP_PORT_DOWN:
>>> -		CL_ASSERT( p_pnp_rec->context );
>>> +		// context will be NULL for RoCE port
>>> +		if ( !p_pnp_rec->context )
>>> +			break;
>>>  		p_sa_req_svc = p_pnp_rec->context;
>>>  		p_sa_req_svc->sm_lid = 0;
>>>  		p_sa_req_svc->sm_sl = 0;
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/al/kernel/al_smi.c branches\mlx4/core/al/kernel/al_smi.c
>>> --- trunk/core/al/kernel/al_smi.c	2011-09-13 09:15:33.822671400 -0700
>>> +++ branches\mlx4/core/al/kernel/al_smi.c	2011-10-10 16:59:00.895869100 -0700
>>> @@ -905,7 +905,10 @@ __complete_send_mad(
>>> 
>>>  	/* Construct a send work completion. */
>>>  	cl_memclr( &wc, sizeof( ib_wc_t ) );
>>> -	wc.wr_id	= p_mad_wr->send_wr.wr_id;
>>> +	if (p_mad_wr) {
>>> +		// Handling the special race where p_mad_wr that comes from spl_qp can be NULL
>>> +		wc.wr_id	= p_mad_wr->send_wr.wr_id;
>>> +	}
>>> 
>>> SH: Please provide more details on why this can happen.  I'm not asking
>>> for a code comment, just a response.  It may make sense to apply this
>>> change separate, so someone can find the details in the change log.
>>> 
>>>  	wc.wc_type	= IB_WC_SEND;
>>>  	wc.status	= wc_status;
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/bus/kernel/bus_pnp.c branches\mlx4/core/bus/kernel/bus_pnp.c
>>> --- trunk/core/bus/kernel/bus_pnp.c	2011-09-13 09:15:31.623451500 -0700
>>> +++ branches\mlx4/core/bus/kernel/bus_pnp.c	2011-10-10 16:58:57.978577400 -0700
>>> @@ -44,7 +44,6 @@
>>>  #include "bus_port_mgr.h"
>>>  #include "bus_iou_mgr.h"
>>>  #include "complib/cl_memory.h"
>>> -#include "al_cm_cep.h"
>>>  #include "al_mgr.h"
>>>  #include "bus_ev_log.h"
>>> @@ -52,7 +51,6 @@
>>>  #include "rdma/verbs.h"
>>>  #include "iba/ib_al_ifc.h"
>>>  #include "iba/ib_ci_ifc.h"
>>> -#include "iba/ib_cm_ifc.h"
>>>  #include "al_cm_cep.h"
>>>  #include "al_mgr.h"
>>>  #include "bus_ev_log.h"
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/bus/kernel/bus_port_mgr.c branches\mlx4/core/bus/kernel/bus_port_mgr.c
>>> --- trunk/core/bus/kernel/bus_port_mgr.c	2011-09-13 09:15:31.580447200 -0700
>>> +++ branches\mlx4/core/bus/kernel/bus_port_mgr.c	2011-10-10 16:58:57.922571800 -0700
>>> @@ -772,6 +772,15 @@ port_mgr_port_add(
>>>  	}
>>>  
>>>  	/*
>>> +	 * Don't create PDO for IPoIB (and start IPoIB) while over a RoCE port.
>>> +	 */
>>> +	if ( p_pnp_rec->p_port_attr->transport != RDMA_TRANSPORT_IB ){
>>> +		BUS_TRACE_EXIT( BUS_DBG_PNP,("IPoIb is not started for RoCE port. %s ca_guid %I64x port(%d)\n",
>>> +								p_bfi->whoami, p_bfi->ca_guid, p_pnp_rec->p_port_attr->port_num));
>>> +		return IB_SUCCESS;
>>> +	}
>>> +
>>> +	/*
>>>  	 * Allocate a PNP context for this object. pnp_rec.context is obj unique.
>>>  	 */
>>>  	if ( !p_ctx ) {
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/bus/kernel/SOURCES branches\mlx4/core/bus/kernel/SOURCES
>>> --- trunk/core/bus/kernel/SOURCES	2011-09-13 09:15:31.567445900 -0700
>>> +++ branches\mlx4/core/bus/kernel/SOURCES	2011-10-10 16:58:57.908570400 -0700
>>> @@ -17,7 +17,7 @@ SOURCES= ibbus.rc		\
>>>  	bus_iou_mgr.c		\
>>>  	bus_stat.c
>>> -INCLUDES=..\..\..\inc;..\..\..\inc\kernel;..\..\al;..\..\al\kernel;..\..\bus\kernel\$O;
>>> +INCLUDES=..\..\..\inc;..\..\..\inc\kernel;..\..\al;..\..\al\kernel;..\..\bus\kernel\$O;..\..\..\hw\mlx4\inc;..\..\..\inc\kernel\iba;
>>> 
>>>  C_DEFINES=$(C_DEFINES) -DDRIVER -DDEPRECATE_DDK_FUNCTIONS -DNEED_CL_OBJ
>>> 
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/complib/cl_memory.c branches\mlx4/core/complib/cl_memory.c
>>> --- trunk/core/complib/cl_memory.c	2011-09-13 09:15:30.739363100 -0700
>>> +++ branches\mlx4/core/complib/cl_memory.c	2011-10-10 16:58:57.086488200 -0700
>>> @@ -92,6 +92,7 @@ __cl_mem_track_start( void )
>>>  	if( status != CL_SUCCESS )
>>>  	{
>>>  		__cl_free_priv( gp_mem_tracker );
>>> +		gp_mem_tracker = NULL;
>>>  		return;
>>>  	}
>>>  }
>>> @@ -179,8 +180,15 @@ cl_mem_display( void )
>>>  		 */
>>>  		p_hdr = PARENT_STRUCT( p_map_item, cl_malloc_hdr_t, map_item );
>>> 
>>> -		cl_msg_out( "\tMemory block at %p allocated in file %s line %d\n",
>>> -			p_hdr->p_mem, p_hdr->file_name, p_hdr->line_num );
>>> +#ifdef CL_KERNEL
>>> +		DbgPrintEx(DPFLTR_IHVNETWORK_ID, DPFLTR_ERROR_LEVEL,
>>> +			"\tMemory block for '%s' at %p of size %#x allocated in file %s line %d\n",
>>> +			(p_hdr->tag == NULL) ? "Unknown" : p_hdr->tag,
>>> +			p_hdr->p_mem, p_hdr->size, p_hdr->file_name, p_hdr->line_num );
>>> +#else
>>> +		cl_msg_out( "\tMemory block at %p of size %#x allocated in file %s line %d\n",
>>> +			p_hdr->p_mem, p_hdr->size, p_hdr->file_name, p_hdr->line_num );
>>> +#endif
>>> 
>>>  		p_map_item = cl_qmap_next( p_map_item );
>>>  	}
>>> @@ -189,18 +197,21 @@ cl_mem_display( void )
>>>  }
>>> +
>>>  /*
>>>   * Allocates memory and stores information about the allocation in a list.
>>>   * The contents of the list can be printed out by calling the function
>>>   * "MemoryReportUsage".  Memory allocation will succeed even if the list
>>>   * cannot be created.
>>>   */
>>> +static
>>>  void*
>>> -__cl_malloc_trk(
>>> +__cl_malloc_trk_internal(
>>>  	IN	const char* const	p_file_name,
>>>  	IN	const int32_t		line_num,
>>>  	IN	const size_t		size,
>>> -	IN	const boolean_t		pageable )
>>> +	IN	const boolean_t		pageable,
>>> +	IN	const char*			tag )
>>>  {
>>>  	cl_malloc_hdr_t	*p_hdr;
>>>  	cl_list_item_t	*p_list_item;
>>> @@ -264,6 +275,8 @@ __cl_malloc_trk(
>>>  	 * not in the list without dereferencing memory we do not own.
>>>  	 */
>>>  	p_hdr->p_mem = p_mem;
>>> +	p_hdr->size = (uint32_t)size;
>>> +	p_hdr->tag = (char*)tag;
>>> 
>>>  	/* Insert the header structure into our allocation list. */
>>>  	cl_qmap_insert( &gp_mem_tracker->alloc_map, (uintptr_t)p_mem, &p_hdr->map_item );
>>> @@ -272,6 +285,34 @@ __cl_malloc_trk(
>>>  	return( p_mem );
>>>  }
>>> +/*
>>> + * Allocates memory and stores information about the allocation in a list.
>>> + * The contents of the list can be printed out by calling the function
>>> + * "MemoryReportUsage".  Memory allocation will succeed even if the list
>>> + * cannot be created.
>>> + */
>>> +void*
>>> +__cl_malloc_trk(
>>> +	IN	const char* const	p_file_name,
>>> +	IN	const int32_t		line_num,
>>> +	IN	const size_t		size,
>>> +	IN	const boolean_t		pageable )
>>> +{
>>> +	return __cl_malloc_trk_internal( p_file_name,
>>> +		line_num, size, pageable, NULL );
>>> +}
>>> +
>>> +void*
>>> +__cl_malloc_trk_ex(
>>> +	IN	const char* const	p_file_name,
>>> +	IN	const int32_t		line_num,
>>> +	IN	const size_t		size,
>>> +	IN	const boolean_t		pageable,
>>> +	IN	const char*			tag )
>>> +{
>>> +	return __cl_malloc_trk_internal( p_file_name,
>>> +		line_num, size, pageable, tag );
>>> +}
>>> 
>>>  /*
>>>   * Allocate non-tracked memory.
>>> @@ -301,6 +342,22 @@ __cl_zalloc_trk(
>>>  	return( p_buffer );
>>>  }
>>> +void*
>>> +__cl_zalloc_trk_ex(
>>> +	IN	const char* const	p_file_name,
>>> +	IN	const int32_t		line_num,
>>> +	IN	const size_t		size,
>>> +	IN	const boolean_t		pageable,
>>> +	IN	const char*			tag )
>>> +{
>>> +	void	*p_buffer;
>>> +
>>> +	p_buffer = __cl_malloc_trk_ex( p_file_name, line_num, size, pageable, tag );
>>> +	if( p_buffer )
>>> +		cl_memclr( p_buffer, size );
>>> +
>>> +	return( p_buffer );
>>> +}
>>> 
>>> SH: We need to stop abstracting memory allocations.  There are already
>>> tools available for tracking memory allocations, especially for the
>>> kernel.
>>> 
>>>  void*
>>>  __cl_zalloc_ntrk(
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/complib/cl_memtrack.h branches\mlx4/core/complib/cl_memtrack.h
>>> --- trunk/core/complib/cl_memtrack.h	2011-09-13 09:15:30.757364900 -0700
>>> +++ branches\mlx4/core/complib/cl_memtrack.h	2011-10-10 16:58:57.108490400 -0700
>>> @@ -76,6 +76,8 @@ typedef struct _cl_malloc_hdr
>>>  	void				*p_mem;
>>>  	char				file_name[FILE_NAME_LENGTH];
>>>  	int32_t				line_num;
>>> +	int32_t				size;
>>> +	char 				*tag;
>>> 
>>>  } cl_malloc_hdr_t;
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/dirs branches\mlx4/core/dirs
>>> --- trunk/core/dirs	2011-09-13 09:15:40.866375700 -0700
>>> +++ branches\mlx4/core/dirs	2011-10-10 16:59:07.969576400 -0700
>>> @@ -4,5 +4,6 @@ DIRS=\
>>>  	bus			\
>>>  	iou			\
>>>  	ibat		\
>>> +	ibat_ex   \
>>>  	winverbs	\
>>>  	winmad
>>> Only in branches\mlx4/core: ibat_ex
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/winverbs/kernel/wv_device.c branches\mlx4/core/winverbs/kernel/wv_device.c
>>> --- trunk/core/winverbs/kernel/wv_device.c	2011-09-13 09:15:26.752964500 -0700
>>> +++ branches\mlx4/core/winverbs/kernel/wv_device.c	2011-10-10 16:58:53.501129700 -0700
>>> @@ -135,10 +135,12 @@ static void WvDeviceEventHandler(ib_even
>>>  			WvDeviceCompleteRequests(&dev->pPorts[i], STATUS_SUCCESS, event);
>>>  		}
>>>  	} else {
>>> +		if(pEvent->port_number <= dev->PortCount) {
>>>  		WvDeviceCompleteRequests(&dev->pPorts[pEvent->port_number - 1],
>>>  								 STATUS_SUCCESS, event);
>>>  	}
>>>  }
>>> +}
>>> 
>>> SH: This check is not needed.  Upper level drivers must be able to
>>> trust that the lower drivers will not give them completely bogus data.
>>> 
>>>  static WV_DEVICE *WvDeviceAlloc(WV_PROVIDER *pProvider)
>>>  {
>>> @@ -216,6 +218,8 @@ static NTSTATUS WvDeviceCreatePorts(WV_D
>>>  		return STATUS_NO_MEMORY;
>>>  	}
>>> +	ASSERT(ControlDevice != NULL);
>>> +
>>>  	WDF_IO_QUEUE_CONFIG_INIT(&config, WdfIoQueueDispatchManual);
>>>  	for (i = 0; i < pDevice->PortCount; i++) {
>>>  		pDevice->pPorts[i].Flags = 0;
>>> @@ -537,8 +541,8 @@ static void WvConvertPortAttr(WV_IO_PORT
>>>  	pAttributes->ActiveWidth	= pPortAttr->active_width;
>>>  	pAttributes->ActiveSpeed	= pPortAttr->active_speed;
>>>  	pAttributes->PhysicalState	= pPortAttr->phys_state;
>>> +	pAttributes->Transport		= (UINT8) pPortAttr->transport;
>>>  	pAttributes->Reserved[0]	= 0;
>>> -	pAttributes->Reserved[1]	= 0;
>>> 
>>> SH: This is fine, but user space must still support older kernels
>>> which set this field to 0.
>>> 
>>>  }
>>>  
>>>  void WvDeviceQuery(WV_PROVIDER *pProvider, WDFREQUEST Request)
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/winverbs/kernel/wv_driver.c branches\mlx4/core/winverbs/kernel/wv_driver.c
>>> --- trunk/core/winverbs/kernel/wv_driver.c	2011-09-13 09:15:26.724961700 -0700
>>> +++ branches\mlx4/core/winverbs/kernel/wv_driver.c	2011-10-10 16:58:53.417121300 -0700
>>> @@ -31,8 +31,10 @@
>>>  #include <wdf.h>
>>>  #include <wdmsec.h>
>>>  #include <ntstatus.h>
>>> +#include <initguid.h>
>>> 
>>>  #include "index_list.c"
>>> +#include <rdma/verbs.h>
>>>  #include "wv_driver.h"
>>>  #include "wv_ioctl.h"
>>>  #include "wv_provider.h"
>>> @@ -44,10 +46,6 @@
>>>  #include "wv_qp.h"
>>>  #include "wv_ep.h"
>>> -#include <initguid.h>
>>> -#include <rdma/verbs.h>
>>> -#include <iba/ib_cm_ifc.h>
>>> -
>>> 
>>> SH: These changes are not necessary, and we need ib_cm_ifc.h, so it
>>> should explicitly be included.
>>> 
>>>  WDF_DECLARE_CONTEXT_TYPE_WITH_NAME(WV_RDMA_DEVICE, WvRdmaDeviceGetContext)
>>> 
>>>  WDFDEVICE				ControlDevice = NULL;
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/winverbs/user/SOURCES branches\mlx4/core/winverbs/user/SOURCES
>>> --- trunk/core/winverbs/user/SOURCES	2011-09-13 09:15:27.658055000 -0700
>>> +++ branches\mlx4/core/winverbs/user/SOURCES	2011-10-10 16:58:54.372216800 -0700
>>> @@ -29,7 +29,15 @@ INCLUDES = ..;..\..\..\inc;..\..\..\inc\
>>> 
>>>  USER_C_FLAGS = $(USER_C_FLAGS) -DEXPORT_WV_SYMBOLS
>>> +!if !$(FREEBUILD)
>>> +C_DEFINES=$(C_DEFINES) -D_DEBUG -DDEBUG -DDBG
>>> +!endif
>>> +
>>>  TARGETLIBS = \
>>>  	$(SDK_LIB_PATH)\kernel32.lib	\
>>>  	$(SDK_LIB_PATH)\uuid.lib		\
>>> -	$(SDK_LIB_PATH)\ws2_32.lib
>>> +	$(SDK_LIB_PATH)\ws2_32.lib      \
>>> +	$(SDK_LIB_PATH)\iphlpapi.lib 	\
>>> +	$(TARGETPATH)\*\ibat_ex.lib     \
>>> +        $(TARGETPATH)\*\ibal.lib        \
>>> +        $(TARGETPATH)\*\complib.lib
>>> 
>>> SH: No.  Winverbs should not depend on ibal or complib.  It shouldn't
>>> even depend on ibat_ex if that can be helped.  It should be as
>>> stand-alone as possible to make it easier to use.  This is why winverbs
>>> sent IOCTLs directly to the kernel for translations.
>>> 
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/winverbs/user/wv_provider.cpp branches\mlx4/core/winverbs/user/wv_provider.cpp
>>> --- trunk/core/winverbs/user/wv_provider.cpp	2011-09-13 09:15:27.769066100 -0700
>>> +++ branches\mlx4/core/winverbs/user/wv_provider.cpp	2011-10-10 16:58:54.380217600 -0700
>>> @@ -35,6 +35,7 @@
>>>  #include "wv_device.h"
>>>  #include "wv_ep.h"
>>>  #include "wv_ioctl.h"
>>> +#include <iba/ibat_ex.h>
>>> 
>>>  CWVProvider::CWVProvider()
>>>  {
>>> @@ -136,42 +137,14 @@ out:
>>>  STDMETHODIMP CWVProvider::
>>>  TranslateAddress(const SOCKADDR* pAddress, WV_DEVICE_ADDRESS* pDeviceAddress)
>>>  {
>>> -	HANDLE hIbat;
>>> -	IOCTL_IBAT_IP_TO_PORT_IN addr;
>>>  	IBAT_PORT_RECORD port;
>>> -	DWORD bytes;
>>> -	HRESULT hr;
>>> -
>>> -	hIbat = CreateFileW(IBAT_WIN32_NAME, GENERIC_READ | GENERIC_WRITE,
>>> -						FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
>>> -						OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
>>> -	if (hIbat == INVALID_HANDLE_VALUE) {
>>> -		return HRESULT_FROM_WIN32(GetLastError());
>>> -	}
>>> -
>>> -	addr.Version = IBAT_IOCTL_VERSION;
>>> -	if (pAddress->sa_family == AF_INET) {
>>> -		addr.Address.IpVersion = 4;
>>> -		RtlCopyMemory(addr.Address.Address + 12,
>>> -					  &((SOCKADDR_IN *)pAddress)->sin_addr, 4);
>>> -	} else {
>>> -		addr.Address.IpVersion = 6;
>>> -		RtlCopyMemory(addr.Address.Address,
>>> -					  &((SOCKADDR_IN6 *)pAddress)->sin6_addr, 16);
>>> -	}
>>> -
>>> -	if (DeviceIoControl(hIbat, IOCTL_IBAT_IP_TO_PORT,
>>> -						&addr, sizeof addr, &port, sizeof port, &bytes, NULL)) {
>>> -		hr = WV_SUCCESS;
>>> +	HRESULT hr = IBAT_EX::IpToPort( pAddress, &port );
>>> +	if ( FAILED( hr ) )
>>> +		return hr;
>>>  		pDeviceAddress->DeviceGuid = port.CaGuid;
>>>  		pDeviceAddress->Pkey = port.PKey;
>>>  		pDeviceAddress->PortNumber = port.PortNum;
>>> -	} else {
>>> -		hr = HRESULT_FROM_WIN32(GetLastError());
>>> -	}
>>> -
>>> -	CloseHandle(hIbat);
>>> -	return hr;
>>> +	return WV_SUCCESS;
>>>  }
>>>  
>>>  STDMETHODIMP CWVProvider::
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/core/winverbs/wv_ioctl.h branches\mlx4/core/winverbs/wv_ioctl.h
>>> --- trunk/core/winverbs/wv_ioctl.h	2011-09-13 09:15:28.105099700 -0700
>>> +++ branches\mlx4/core/winverbs/wv_ioctl.h	2011-10-10 16:58:54.643243900 -0700
>>> @@ -436,7 +436,8 @@ typedef struct _WV_IO_PORT_ATTRIBUTES
>>>  	UINT8			ActiveWidth;
>>>  	UINT8			ActiveSpeed;
>>>  	UINT8			PhysicalState;
>>> -	UINT8			Reserved[2];
>>> +	UINT8			Transport;
>>> +	UINT8			Reserved[1];
>>> 
>>>  }	WV_IO_PORT_ATTRIBUTES;
>>> SH: Again, fine, but we need to handle the case where it is not set by
>>> an older driver.  (I'm writing this as I'm reviewing the code, so it may
>>> be handled below.)
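One way user space could tolerate an older driver is sketched below. It assumes, for illustration only, that IB is numbered 0 in the transport enum, so that a byte left at zero by a driver that still treats it as reserved decodes as IB. The `sketch_*` names and the enum numbering are assumptions; the real `RDMA_TRANSPORT_*` values in rdma/verbs.h should be checked, and if IB is not 0 there, a zero byte would need its own "not reported" handling.

```c
#include <stdint.h>

/* Assumed numbering, for illustration only; check rdma/verbs.h for the
 * real RDMA_TRANSPORT_* values. */
typedef enum {
    SKETCH_TRANSPORT_IB     = 0,
    SKETCH_TRANSPORT_IWARP  = 1,
    SKETCH_TRANSPORT_RDMAOE = 2
} sketch_transport_t;

/* Tail of the port attributes as laid out after the change. */
typedef struct {
    uint8_t PhysicalState;
    uint8_t Transport;   /* was Reserved[0]; older drivers leave it 0 */
    uint8_t Reserved[1];
} sketch_port_attr_tail_t;

/* Decode the transport byte, falling back to IB for 0 or unknown values. */
sketch_transport_t sketch_port_transport(const sketch_port_attr_tail_t *attr)
{
    switch (attr->Transport) {
    case SKETCH_TRANSPORT_IWARP:
        return SKETCH_TRANSPORT_IWARP;
    case SKETCH_TRANSPORT_RDMAOE:
        return SKETCH_TRANSPORT_RDMAOE;
    default:
        /* 0 from an older driver, or a value this library does not know */
        return SKETCH_TRANSPORT_IB;
    }
}
```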
>>> 
>>> 
>>> <snip hw/mlx4 diffs>
>>> 
>>> 
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/hw/mthca/kernel/hca_pnp.c branches\mlx4/hw/mthca/kernel/hca_pnp.c
>>> --- trunk/hw/mthca/kernel/hca_pnp.c	2011-09-13 09:16:17.408029500 -0700
>>> +++ branches\mlx4/hw/mthca/kernel/hca_pnp.c	2011-10-10 16:59:46.774456500 -0700
>>> @@ -12,6 +12,7 @@
>>> 
>>>  #include "hca_driver.h"
>>>  #include "mthca_dev.h"
>>> +#include <rdma\verbs.h>
>>> 
>>>  #if defined(EVENT_TRACING)
>>>  #ifdef offsetof
>>> @@ -21,7 +22,6 @@
>>>  #endif
>>>  #include "mthca.h"
>>>  #include <initguid.h>
>>> -#include <rdma\verbs.h>
>>>  #include <wdmguid.h>
>>>  
>>>  extern const char *mthca_version;
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/hw/mthca/kernel/hca_verbs.c branches\mlx4/hw/mthca/kernel/hca_verbs.c
>>> --- trunk/hw/mthca/kernel/hca_verbs.c	2011-09-13 09:16:17.298018500 -0700
>>> +++ branches\mlx4/hw/mthca/kernel/hca_verbs.c	2011-10-10 16:59:46.656444700 -0700
>>> @@ -1093,7 +1093,7 @@ mlnx_create_spl_qp (
>>>  	IN		const	uint8_t				port_num,
>>>  	IN		const	void				*qp_context,
>>>  	IN				ci_async_event_cb_t	event_handler,
>>> -	IN		const	ib_qp_create_t		*p_create_attr,
>>> +	IN	OUT			ib_qp_create_t		*p_create_attr,
>>>  		OUT			ib_qp_attr_t		*p_qp_attr,
>>>  		OUT			ib_qp_handle_t		*ph_qp )
>>>  {
>>> @@ -1118,7 +1118,7 @@ mlnx_create_qp (
>>>  	IN		const	ib_pd_handle_t		h_pd,
>>>  	IN		const	void				*qp_context,
>>>  	IN				ci_async_event_cb_t	event_handler,
>>> -	IN		const	ib_qp_create_t		*p_create_attr,
>>> +	IN	OUT			ib_qp_create_t		*p_create_attr,
>>>  		OUT			ib_qp_attr_t		*p_qp_attr,
>>>  		OUT			ib_qp_handle_t		*ph_qp,
>>>  	IN	OUT			ci_umv_buf_t		*p_umv_buf )
>>> @@ -1641,6 +1641,19 @@ mlnx_port_get_transport (
>>>  	UNREFERENCED_PARAMETER(port_num);
>>>  	return RDMA_TRANSPORT_IB;
>>>  }
>>> +
>>> +uint8_t
>>> +mlnx_get_sl_for_ip_port (
>>> +	IN		const	ib_ca_handle_t	h_ca,
>>> +	IN	const uint8_t				ca_port_num,
>>> +	IN	const uint16_t				ip_port_num)
>>> +{
>>> +	UNREFERENCED_PARAMETER(h_ca);
>>> +	UNREFERENCED_PARAMETER(ca_port_num);
>>> +	UNREFERENCED_PARAMETER(ip_port_num);
>>> +	return 0xff;
>>> +}
>>> +
>>>  void
>>>  setup_ci_interface(
>>>  	IN		const	ib_net64_t			ca_guid,
>>> @@ -1697,7 +1710,7 @@ setup_ci_interface(
>>> 
>>>  	p_interface->local_mad = mlnx_local_mad;
>>>  	p_interface->rdma_port_get_transport = mlnx_port_get_transport;
>>> -
>>> +	p_interface->get_sl_for_ip_port = mlnx_get_sl_for_ip_port;
>>> 
>>>  	mlnx_memory_if(p_interface);
>>>  	mlnx_direct_if(p_interface);
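
The mthca stub for get_sl_for_ip_port in the hunks above always returns 0xff. The diff does not say what 0xff means, so the following is only a sketch of how a caller might handle it, assuming 0xff is intended as a "no SL mapping" sentinel. The names SL_NOT_MAPPED, IB_MAX_SL, stub_get_sl_for_ip_port and choose_sl are hypothetical and do not appear in the branch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 0xff is a "no mapping" sentinel; the patch does not document it. */
#define SL_NOT_MAPPED ((uint8_t)0xff)
/* An IB service level is a 4-bit field, so valid values are 0..15. */
#define IB_MAX_SL     ((uint8_t)15)

/* Hypothetical stand-in for the mthca stub: it ignores its inputs. */
uint8_t stub_get_sl_for_ip_port(uint8_t ca_port_num, uint16_t ip_port_num)
{
    (void)ca_port_num;
    (void)ip_port_num;
    return SL_NOT_MAPPED;
}

/* Hypothetical caller-side helper: keep a usable SL, otherwise fall back. */
uint8_t choose_sl(uint8_t reported_sl, uint8_t fallback_sl)
{
    assert(fallback_sl <= IB_MAX_SL);
    return (reported_sl > IB_MAX_SL) ? fallback_sl : reported_sl;
}
```

With a check like this, a driver that only provides the stub behaves the same as one that has no mapping configured, which seems to be the intent for mthca here.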
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/hw/mthca/kernel/mthca_provider.c branches\mlx4/hw/mthca/kernel/mthca_provider.c
>>> --- trunk/hw/mthca/kernel/mthca_provider.c	2011-09-13 09:16:17.761064800 -0700
>>> +++ branches\mlx4/hw/mthca/kernel/mthca_provider.c	2011-10-10 16:59:47.018480900 -0700
>>> @@ -766,7 +766,7 @@ static struct ib_cq *mthca_create_cq(str
>>>  		cq->set_ci_db_index = ucmd.set_db_index;
>>>  		cq->arm_db_index    = ucmd.arm_db_index;
>>>  		cq->u_arm_db_index    = ucmd.u_arm_db_index;
>>> -		cq->p_u_arm_sn = (int*)((char*)u_arm_db_page + BYTE_OFFSET(ucmd.u_arm_db_page));
>>> +		cq->p_u_arm_sn = (volatile u32 *)((char*)u_arm_db_page + BYTE_OFFSET(ucmd.u_arm_db_page));
>>>  	}
>>>  
>>>  	for (nent = 1; nent <= entries; nent <<= 1)
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/hw/mthca/kernel/mthca_provider.h branches\mlx4/hw/mthca/kernel/mthca_provider.h
>>> --- trunk/hw/mthca/kernel/mthca_provider.h	2011-09-13 09:16:17.850073700 -0700
>>> +++ branches\mlx4/hw/mthca/kernel/mthca_provider.h	2011-10-10 16:59:47.110490100 -0700
>>> @@ -203,7 +203,7 @@ struct mthca_cq {
>>>  	__be32                *arm_db;
>>>  	int                    arm_sn;
>>>  	int                    u_arm_db_index;
>>> -	int                *p_u_arm_sn;
>>> +	volatile u32          *p_u_arm_sn;
>>> 
>>>  	union mthca_buf        queue;
>>>  	struct mthca_mr        mr;
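
The two hunks above change p_u_arm_sn from int * to volatile u32 *. A minimal sketch of what that buys, assuming the pointer refers to memory that user mode can also write (the u_ prefix and the mapped doorbell page suggest this, but the diff does not state it): volatile makes the compiler perform every access through the pointer, and the unsigned type makes wraparound well defined. The helpers read_arm_sn and bump_arm_sn are hypothetical, not functions from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Take one snapshot of the shared sequence number.
 * The volatile qualifier keeps the compiler from caching or dropping the load. */
u32 read_arm_sn(volatile u32 *p_u_arm_sn)
{
    assert(p_u_arm_sn != NULL);
    return *p_u_arm_sn;
}

/* Advance the sequence number. Unsigned arithmetic wraps modulo 2^32,
 * whereas overflow of the old signed int would have been undefined. */
u32 bump_arm_sn(volatile u32 *p_u_arm_sn)
{
    u32 next;

    assert(p_u_arm_sn != NULL);
    next = *p_u_arm_sn + 1u;
    *p_u_arm_sn = next;
    return next;
}
```

Note that volatile alone gives neither atomicity nor memory ordering, so any real cross-mode protocol would still need whatever synchronization the driver already relies on.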
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I '\$Id' trunk/hw/mthca/kernel/mthca_qp.c branches\mlx4/hw/mthca/kernel/mthca_qp.c
>>> --- trunk/hw/mthca/kernel/mthca_qp.c	2011-09-13 09:16:18.478136500 -0700
>>> +++ branches\mlx4/hw/mthca/kernel/mthca_qp.c	2011-10-10 16:59:47.636542700 -0700
>>> @@ -10,18 +10,18 @@
>>>   * COPYING in the main directory of this source tree, or the
>>>   * OpenIB.org BSD license below:
>>>   *
>>> - *     Redistribution and use in source and binary forms, with or
>>> - *     without modification, are permitted provided that the following
>>> - *     conditions are met:
>>> + *	   Redistribution and use in source and binary forms, with or
>>> + *	   without modification, are permitted provided that the following
>>> + *	   conditions are met:
>>>   *
>>> - *      - Redistributions of source code must retain the above
>>> - *        copyright notice, this list of conditions and the following
>>> - *        disclaimer.
>>> + *		- Redistributions of source code must retain the above
>>> + *		  copyright notice, this list of conditions and the following
>>> + *		  disclaimer.
>>>   *
>>> - *      - Redistributions in binary form must reproduce the above
>>> - *        copyright notice, this list of conditions and the following
>>> - *        disclaimer in the documentation and/or other materials
>>> - *        provided with the distribution.
>>> + *		- Redistributions in binary form must reproduce the above
>>> + *		  copyright notice, this list of conditions and the following
>>> + *		  disclaimer in the documentation and/or other materials
>>> + *		  provided with the distribution.
>>>   *
>>>   * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>>>   * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>>> @@ -53,36 +53,36 @@
>>> 
>>>  enum {
>>>  	MTHCA_MAX_DIRECT_QP_SIZE = 4 * PAGE_SIZE,
>>> -	MTHCA_ACK_REQ_FREQ       = 10,
>>> -	MTHCA_FLIGHT_LIMIT       = 9,
>>> -	MTHCA_UD_HEADER_SIZE     = 72, /* largest UD header possible */
>>> +	MTHCA_ACK_REQ_FREQ		 = 10,
>>> +	MTHCA_FLIGHT_LIMIT		 = 9,
>>> +	MTHCA_UD_HEADER_SIZE	 = 72, /* largest UD header possible */
>>>  	MTHCA_INLINE_HEADER_SIZE = 4,  /* data segment overhead for inline */
>>>  	MTHCA_INLINE_CHUNK_SIZE  = 16  /* inline data segment chunk */
>>>  };
>>> 
>>>  enum {
>>> -	MTHCA_QP_STATE_RST  = 0,
>>> +	MTHCA_QP_STATE_RST	= 0,
>>>  	MTHCA_QP_STATE_INIT = 1,
>>> -	MTHCA_QP_STATE_RTR  = 2,
>>> -	MTHCA_QP_STATE_RTS  = 3,
>>> -	MTHCA_QP_STATE_SQE  = 4,
>>> -	MTHCA_QP_STATE_SQD  = 5,
>>> -	MTHCA_QP_STATE_ERR  = 6,
>>> +	MTHCA_QP_STATE_RTR	= 2,
>>> +	MTHCA_QP_STATE_RTS	= 3,
>>> +	MTHCA_QP_STATE_SQE	= 4,
>>> +	MTHCA_QP_STATE_SQD	= 5,
>>> +	MTHCA_QP_STATE_ERR	= 6,
>>>  	MTHCA_QP_STATE_DRAINING = 7
>>>  };
>>>  
>>>  enum {
>>> -	MTHCA_QP_ST_RC 	= 0x0,
>>> -	MTHCA_QP_ST_UC 	= 0x1,
>>> -	MTHCA_QP_ST_RD 	= 0x2,
>>> -	MTHCA_QP_ST_UD 	= 0x3,
>>> +	MTHCA_QP_ST_RC	= 0x0,
>>> +	MTHCA_QP_ST_UC	= 0x1,
>>> +	MTHCA_QP_ST_RD	= 0x2,
>>> +	MTHCA_QP_ST_UD	= 0x3,
>>>  	MTHCA_QP_ST_MLX = 0x7
>>>  };
>>>  
>>>  enum {
>>>  	MTHCA_QP_PM_MIGRATED = 0x3,
>>> -	MTHCA_QP_PM_ARMED    = 0x0,
>>> -	MTHCA_QP_PM_REARM    = 0x1
>>> +	MTHCA_QP_PM_ARMED	 = 0x0,
>>> +	MTHCA_QP_PM_REARM	 = 0x1
>>>  };
>>>  
>>>  enum {
>>> @@ -105,24 +105,24 @@ enum {
>>>  #pragma pack(push,1)
>>>  struct mthca_qp_path {
>>>  	__be32 port_pkey;
>>> -	u8     rnr_retry;
>>> -	u8     g_mylmc;
>>> +	u8	   rnr_retry;
>>> +	u8	   g_mylmc;
>>>  	__be16 rlid;
>>> -	u8     ackto;
>>> -	u8     mgid_index;
>>> -	u8     static_rate;
>>> -	u8     hop_limit;
>>> +	u8	   ackto;
>>> +	u8	   mgid_index;
>>> +	u8	   static_rate;
>>> +	u8	   hop_limit;
>>>  	__be32 sl_tclass_flowlabel;
>>> -	u8     rgid[16];
>>> +	u8	   rgid[16];
>>>  } ;
>>>  
>>>  struct mthca_qp_context {
>>>  	__be32 flags;
>>>  	__be32 tavor_sched_queue; /* Reserved on Arbel */
>>> -	u8     mtu_msgmax;
>>> -	u8     rq_size_stride;	/* Reserved on Tavor */
>>> -	u8     sq_size_stride;	/* Reserved on Tavor */
>>> -	u8     rlkey_arbel_sched_queue;	/* Reserved on Tavor */
>>> +	u8	   mtu_msgmax;
>>> +	u8	   rq_size_stride;	/* Reserved on Tavor */
>>> +	u8	   sq_size_stride;	/* Reserved on Tavor */
>>> +	u8	   rlkey_arbel_sched_queue; /* Reserved on Tavor */
>>>  	__be32 usr_page;
>>>  	__be32 local_qpn;
>>>  	__be32 remote_qpn;
>>> @@ -164,23 +164,23 @@ struct mthca_qp_param {
>>>  #pragma pack(pop)
>>>  
>>>  enum {
>>> -	MTHCA_QP_OPTPAR_ALT_ADDR_PATH     = 1 << 0,
>>> -	MTHCA_QP_OPTPAR_RRE               = 1 << 1,
>>> -	MTHCA_QP_OPTPAR_RAE               = 1 << 2,
>>> -	MTHCA_QP_OPTPAR_RWE               = 1 << 3,
>>> -	MTHCA_QP_OPTPAR_PKEY_INDEX        = 1 << 4,
>>> -	MTHCA_QP_OPTPAR_Q_KEY             = 1 << 5,
>>> -	MTHCA_QP_OPTPAR_RNR_TIMEOUT       = 1 << 6,
>>> +	MTHCA_QP_OPTPAR_ALT_ADDR_PATH	  = 1 << 0,
>>> +	MTHCA_QP_OPTPAR_RRE 			  = 1 << 1,
>>> +	MTHCA_QP_OPTPAR_RAE 			  = 1 << 2,
>>> +	MTHCA_QP_OPTPAR_RWE 			  = 1 << 3,
>>> +	MTHCA_QP_OPTPAR_PKEY_INDEX		  = 1 << 4,
>>> +	MTHCA_QP_OPTPAR_Q_KEY			  = 1 << 5,
>>> +	MTHCA_QP_OPTPAR_RNR_TIMEOUT 	  = 1 << 6,
>>>  	MTHCA_QP_OPTPAR_PRIMARY_ADDR_PATH = 1 << 7,
>>> -	MTHCA_QP_OPTPAR_SRA_MAX           = 1 << 8,
>>> -	MTHCA_QP_OPTPAR_RRA_MAX           = 1 << 9,
>>> -	MTHCA_QP_OPTPAR_PM_STATE          = 1 << 10,
>>> -	MTHCA_QP_OPTPAR_PORT_NUM          = 1 << 11,
>>> -	MTHCA_QP_OPTPAR_RETRY_COUNT       = 1 << 12,
>>> -	MTHCA_QP_OPTPAR_ALT_RNR_RETRY     = 1 << 13,
>>> -	MTHCA_QP_OPTPAR_ACK_TIMEOUT       = 1 << 14,
>>> -	MTHCA_QP_OPTPAR_RNR_RETRY         = 1 << 15,
>>> -	MTHCA_QP_OPTPAR_SCHED_QUEUE       = 1 << 16
>>> +	MTHCA_QP_OPTPAR_SRA_MAX 		  = 1 << 8,
>>> +	MTHCA_QP_OPTPAR_RRA_MAX 		  = 1 << 9,
>>> +	MTHCA_QP_OPTPAR_PM_STATE		  = 1 << 10,
>>> +	MTHCA_QP_OPTPAR_PORT_NUM		  = 1 << 11,
>>> +	MTHCA_QP_OPTPAR_RETRY_COUNT 	  = 1 << 12,
>>> +	MTHCA_QP_OPTPAR_ALT_RNR_RETRY	  = 1 << 13,
>>> +	MTHCA_QP_OPTPAR_ACK_TIMEOUT 	  = 1 << 14,
>>> +	MTHCA_QP_OPTPAR_RNR_RETRY		  = 1 << 15,
>>> +	MTHCA_QP_OPTPAR_SCHED_QUEUE 	  = 1 << 16
>>>  };
>>>  
>>>  static const u8 mthca_opcode[] = {
>>> @@ -209,110 +209,110 @@ static void fill_state_table()
>>> 
>>>  	/* IBQPS_RESET */
>>>  	t = &state_table[IBQPS_RESET][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>>  	t[IBQPS_INIT].trans 						= MTHCA_TRANS_RST2INIT;
>>> -	t[IBQPS_INIT].req_param[UD]  	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_QKEY;
>>> -	t[IBQPS_INIT].req_param[UC] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> -	t[IBQPS_INIT].req_param[RC]  	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> -	t[IBQPS_INIT].req_param[MLX]  	= IB_QP_PKEY_INDEX |IB_QP_QKEY;
>>> -	t[IBQPS_INIT].opt_param[MLX]  	= IB_QP_PORT;
>>> +	t[IBQPS_INIT].req_param[UD] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_QKEY;
>>> +	t[IBQPS_INIT].req_param[UC] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> +	t[IBQPS_INIT].req_param[RC] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> +	t[IBQPS_INIT].req_param[MLX]	= IB_QP_PKEY_INDEX |IB_QP_QKEY;
>>> +	t[IBQPS_INIT].opt_param[MLX]	= IB_QP_PORT;
>>> 
>>>  	/* IBQPS_INIT */
>>>  	t = &state_table[IBQPS_INIT][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>>  	t[IBQPS_INIT].trans 						= MTHCA_TRANS_INIT2INIT;
>>> -	t[IBQPS_INIT].opt_param[UD] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_QKEY;
>>> -	t[IBQPS_INIT].opt_param[UC] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> -	t[IBQPS_INIT].opt_param[RC]  	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> -	t[IBQPS_INIT].opt_param[MLX]  	= IB_QP_PKEY_INDEX |IB_QP_QKEY;
>>> +	t[IBQPS_INIT].opt_param[UD] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_QKEY;
>>> +	t[IBQPS_INIT].opt_param[UC] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> +	t[IBQPS_INIT].opt_param[RC] 	= IB_QP_PKEY_INDEX |IB_QP_PORT |IB_QP_ACCESS_FLAGS;
>>> +	t[IBQPS_INIT].opt_param[MLX]	= IB_QP_PKEY_INDEX |IB_QP_QKEY;
>>> 
>>> -	t[IBQPS_RTR].trans 						=
>>> MTHCA_TRANS_INIT2RTR;
>>> -	t[IBQPS_RTR].req_param[UC]  	=
>>> +	t[IBQPS_RTR].trans						=
>>> MTHCA_TRANS_INIT2RTR;
>>> +	t[IBQPS_RTR].req_param[UC]		=
>>>  		IB_QP_AV |IB_QP_PATH_MTU |IB_QP_DEST_QPN
>>> |IB_QP_RQ_PSN;
>>> -	t[IBQPS_RTR].req_param[RC]  	=
>>> +	t[IBQPS_RTR].req_param[RC]		=
>>>  		IB_QP_AV |IB_QP_PATH_MTU |IB_QP_DEST_QPN
>>> |IB_QP_RQ_PSN |IB_QP_MAX_DEST_RD_ATOMIC
>>> |IB_QP_MIN_RNR_TIMER;
>>> -	t[IBQPS_RTR].opt_param[UD]  	= IB_QP_PKEY_INDEX |IB_QP_QKEY;
>>> -	t[IBQPS_RTR].opt_param[UC]  	= IB_QP_PKEY_INDEX
>>> |IB_QP_ALT_PATH |IB_QP_ACCESS_FLAGS;
>>> -	t[IBQPS_RTR].opt_param[RC]  	= IB_QP_PKEY_INDEX
>>> |IB_QP_ALT_PATH |IB_QP_ACCESS_FLAGS;
>>> -	t[IBQPS_RTR].opt_param[MLX]  	= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_RTR].opt_param[UD]		= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_RTR].opt_param[UC]		= IB_QP_PKEY_INDEX
>>> |IB_QP_ALT_PATH |IB_QP_ACCESS_FLAGS;
>>> +	t[IBQPS_RTR].opt_param[RC]		= IB_QP_PKEY_INDEX
>>> |IB_QP_ALT_PATH |IB_QP_ACCESS_FLAGS;
>>> +	t[IBQPS_RTR].opt_param[MLX] 	= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> 
>>> -/* IBQPS_RTR */
>>> +/* IBQPS_RTR */
>>>  	t = &state_table[IBQPS_RTR][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>> -	t[IBQPS_RTS].trans 						=
>>> MTHCA_TRANS_RTR2RTS;
>>> -	t[IBQPS_RTS].req_param[UD]  	= IB_QP_SQ_PSN;
>>> -	t[IBQPS_RTS].req_param[UC]  	= IB_QP_SQ_PSN;
>>> -	t[IBQPS_RTS].req_param[RC]  	=
>>> +	t[IBQPS_RTS].trans						=
>>> MTHCA_TRANS_RTR2RTS;
>>> +	t[IBQPS_RTS].req_param[UD]		= IB_QP_SQ_PSN;
>>> +	t[IBQPS_RTS].req_param[UC]		= IB_QP_SQ_PSN;
>>> +	t[IBQPS_RTS].req_param[RC]		=
>>>  		IB_QP_TIMEOUT |IB_QP_RETRY_CNT |IB_QP_RNR_RETRY
>>> |IB_QP_SQ_PSN |IB_QP_MAX_QP_RD_ATOMIC;
>>> -	t[IBQPS_RTS].req_param[MLX]  	= IB_QP_SQ_PSN;
>>> -	t[IBQPS_RTS].opt_param[UD]  	= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> -	t[IBQPS_RTS].opt_param[UC]  	=
>>> +	t[IBQPS_RTS].req_param[MLX] 	= IB_QP_SQ_PSN;
>>> +	t[IBQPS_RTS].opt_param[UD]		= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[UC]		=
>>>  		IB_QP_CUR_STATE |IB_QP_ALT_PATH
>>> |IB_QP_ACCESS_FLAGS |IB_QP_PATH_MIG_STATE; -
>>> 	t[IBQPS_RTS].opt_param[RC] 	= 	IB_QP_CUR_STATE |IB_QP_ALT_PATH |
>>> +	t[IBQPS_RTS].opt_param[RC]		=	IB_QP_CUR_STATE
>> |IB_QP_ALT_PATH |
>>>  		IB_QP_ACCESS_FLAGS |IB_QP_MIN_RNR_TIMER
>>> |IB_QP_PATH_MIG_STATE;
>>> -	t[IBQPS_RTS].opt_param[MLX]  	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[MLX] 	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> 
>>> -	/* IBQPS_RTS */
>>> +	/* IBQPS_RTS */
>>>  	t = &state_table[IBQPS_RTS][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>> -	t[IBQPS_RTS].trans 						= MTHCA_TRANS_RTS2RTS;
>>> -	t[IBQPS_RTS].opt_param[UD]  	= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> -	t[IBQPS_RTS].opt_param[UC]  	= IB_QP_ACCESS_FLAGS |IB_QP_ALT_PATH
>>> |IB_QP_PATH_MIG_STATE; -	t[IBQPS_RTS].opt_param[RC]  	=
>>> 	IB_QP_ACCESS_FLAGS | +	t[IBQPS_RTS].trans 		= MTHCA_TRANS_RTS2RTS;
>>> +	t[IBQPS_RTS].opt_param[UD]		= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[UC]		= IB_QP_ACCESS_FLAGS |IB_QP_ALT_PATH
>>> |IB_QP_PATH_MIG_STATE; +	t[IBQPS_RTS].opt_param[RC] 	=
>>> 	IB_QP_ACCESS_FLAGS |
>>>  		IB_QP_ALT_PATH |IB_QP_PATH_MIG_STATE
>>> |IB_QP_MIN_RNR_TIMER;
>>> -	t[IBQPS_RTS].opt_param[MLX]  	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[MLX] 	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> 
>>> -	t[IBQPS_SQD].trans 						=
>>> MTHCA_TRANS_RTS2SQD;
>>> -	t[IBQPS_SQD].opt_param[UD]  	=
>>> IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> -	t[IBQPS_SQD].opt_param[UC]  	=
>>> IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> -	t[IBQPS_SQD].opt_param[RC]  	=
>>> 	IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> -	t[IBQPS_SQD].opt_param[MLX]  	=
>>> IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> +	t[IBQPS_SQD].trans						=
>>> MTHCA_TRANS_RTS2SQD;
>>> +	t[IBQPS_SQD].opt_param[UD]		=
>>> IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> +	t[IBQPS_SQD].opt_param[UC]		=
>>> IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> +	t[IBQPS_SQD].opt_param[RC]		=
>>> 	IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> +	t[IBQPS_SQD].opt_param[MLX] 	=
>>> IB_QP_EN_SQD_ASYNC_NOTIFY;
>>> 
>>> -	/* IBQPS_SQD */
>>> +	/* IBQPS_SQD */
>>>  	t = &state_table[IBQPS_SQD][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>> -	t[IBQPS_RTS].trans 						= MTHCA_TRANS_SQD2RTS;
>>> -	t[IBQPS_RTS].opt_param[UD]  	= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> -	t[IBQPS_RTS].opt_param[UC]  	= IB_QP_CUR_STATE |
>>> +	t[IBQPS_RTS].trans						= MTHCA_TRANS_SQD2RTS;
>>> +	t[IBQPS_RTS].opt_param[UD]		= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[UC]		= IB_QP_CUR_STATE |
>>>  		IB_QP_ALT_PATH |IB_QP_ACCESS_FLAGS
>>> |IB_QP_PATH_MIG_STATE; -	t[IBQPS_RTS].opt_param[RC]  	=
>>> 	IB_QP_CUR_STATE |IB_QP_ALT_PATH | + 	t[IBQPS_RTS].opt_param[RC]		=
>>> 	IB_QP_CUR_STATE |IB_QP_ALT_PATH |
>>>  		IB_QP_ACCESS_FLAGS |IB_QP_MIN_RNR_TIMER
>>> |IB_QP_PATH_MIG_STATE;
>>> -	t[IBQPS_RTS].opt_param[MLX]  	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[MLX] 	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> 
>>> -	t[IBQPS_SQD].trans 						=
>>> MTHCA_TRANS_SQD2SQD;
>>> -	t[IBQPS_SQD].opt_param[UD]  	= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> -	t[IBQPS_SQD].opt_param[UC]  	= IB_QP_AV |
>>> 	IB_QP_CUR_STATE |
>>> +	t[IBQPS_SQD].trans						=
>>> MTHCA_TRANS_SQD2SQD;
>>> +	t[IBQPS_SQD].opt_param[UD]		= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_SQD].opt_param[UC]		= IB_QP_AV |
>>> 	IB_QP_CUR_STATE |
>>>  		IB_QP_ALT_PATH |IB_QP_ACCESS_FLAGS
>>> |IB_QP_PKEY_INDEX |IB_QP_PATH_MIG_STATE; - 	t[IBQPS_SQD].opt_param[RC]
>>> 	= 	IB_QP_AV |IB_QP_TIMEOUT |IB_QP_RETRY_CNT |IB_QP_RNR_RETRY |
>>> +	t[IBQPS_SQD].opt_param[RC]		=	IB_QP_AV |IB_QP_TIMEOUT
>>> |IB_QP_RETRY_CNT |IB_QP_RNR_RETRY |
>>>  		IB_QP_MAX_QP_RD_ATOMIC |IB_QP_MAX_DEST_RD_ATOMIC |IB_QP_CUR_STATE
>>>  |IB_QP_ALT_PATH | 		IB_QP_ACCESS_FLAGS
>> |IB_QP_PKEY_INDEX
>>> |IB_QP_MIN_RNR_TIMER |IB_QP_PATH_MIG_STATE;
>>> -	t[IBQPS_SQD].opt_param[MLX]  	= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> +	t[IBQPS_SQD].opt_param[MLX] 	= IB_QP_PKEY_INDEX
>>> |IB_QP_QKEY;
>>> 
>>> -	/* IBQPS_SQE */
>>> +	/* IBQPS_SQE */
>>>  	t = &state_table[IBQPS_SQE][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>> -	t[IBQPS_RTS].trans 						= MTHCA_TRANS_SQERR2RTS;
>>> -	t[IBQPS_RTS].opt_param[UD]  	= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> -	t[IBQPS_RTS].opt_param[UC]  	= IB_QP_CUR_STATE | IB_QP_ACCESS_FLAGS;
>>> -//	t[IBQPS_RTS].opt_param[RC]  	= 	IB_QP_CUR_STATE
>>> |IB_QP_MIN_RNR_TIMER; -	t[IBQPS_RTS].opt_param[MLX] 	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY; +	t[IBQPS_RTS].trans 		= MTHCA_TRANS_SQERR2RTS;
>>> +	t[IBQPS_RTS].opt_param[UD]		= IB_QP_CUR_STATE |IB_QP_QKEY;
>>> +	t[IBQPS_RTS].opt_param[UC]		= IB_QP_CUR_STATE | IB_QP_ACCESS_FLAGS;
>>> +//	t[IBQPS_RTS].opt_param[RC]		=	IB_QP_CUR_STATE
>>> |IB_QP_MIN_RNR_TIMER; +	t[IBQPS_RTS].opt_param[MLX] 	= IB_QP_CUR_STATE
>>> |IB_QP_QKEY;
>>> 
>>> -	/* IBQPS_ERR */
>>> +	/* IBQPS_ERR */
>>>  	t = &state_table[IBQPS_ERR][0];
>>> -	t[IBQPS_RESET].trans 					=
>>> MTHCA_TRANS_ANY2RST;
>>> -	t[IBQPS_ERR].trans 						=
>>> MTHCA_TRANS_ANY2ERR;
>>> +	t[IBQPS_RESET].trans					=
>>> MTHCA_TRANS_ANY2RST;
>>> +	t[IBQPS_ERR].trans						=
>>> MTHCA_TRANS_ANY2ERR;
>>> 
>>>  };
>>> @@ -337,7 +337,7 @@ static void dump_wqe(u32 print_lvl, u32
>>>  	UNUSED_PARAM_WOWPP(qp_ptr);
>>>  	UNUSED_PARAM_WOWPP(print_lvl);
>>> -	(void) wqe;	/* avoid warning if mthca_dbg compiled away... */
>>> +	(void) wqe; /* avoid warning if mthca_dbg compiled away... */
>>>  	HCA_PRINT(print_lvl,HCA_DBG_QP,("WQE contents  QPN 0x%06x \n",qp_ptr->qpn));
>>>  	HCA_PRINT(print_lvl,HCA_DBG_QP,("WQE contents [%02x] %08x %08x %08x %08x \n",0
>>>  		, cl_ntoh32(wqe[0]), cl_ntoh32(wqe[1]), cl_ntoh32(wqe[2]), cl_ntoh32(wqe[3])));
>>> @@ -367,7 +367,7 @@ static void *get_send_wqe(struct mthca_q
>>>  			(n << qp->sq.wqe_shift);
>>>  	else
>>>  		return (u8*)qp->queue.page_list[(qp->send_wqe_offset +
>>> -					    (n << qp->sq.wqe_shift)) >>
>>> +						(n << qp->sq.wqe_shift)) >>
>>>  					   PAGE_SHIFT].page +
>>>  			((qp->send_wqe_offset + (n << qp->sq.wqe_shift)) &
>>>  			 (PAGE_SIZE - 1));
>>> @@ -378,12 +378,12 @@ static void mthca_wq_init(struct mthca_w
>>>  	spin_lock_init(&wq->lock);
>>>  	wq->next_ind  = 0;
>>>  	wq->last_comp = wq->max - 1;
>>> -	wq->head      = 0;
>>> -	wq->tail      = 0;
>>> +	wq->head	  = 0;
>>> +	wq->tail	  = 0;
>>>  }
>>>  
>>>  void mthca_qp_event(struct mthca_dev *dev, u32 qpn,
>>> -		    enum ib_event_type event_type, u8 vendor_code)
>>> +			enum ib_event_type event_type, u8 vendor_code)
>>>  {
>>>  	struct mthca_qp *qp;
>>>  	ib_event_rec_t event;
>>> @@ -403,7 +403,7 @@ void mthca_qp_event(struct mthca_dev *de
>>>  	event.type = event_type;
>>>  	event.context = qp->ibqp.qp_context;
>>>  	event.vendor_specific = vendor_code;
>>> -	HCA_PRINT(TRACE_LEVEL_WARNING,HCA_DBG_QP,("QP %06x Async event  event_type 0x%x vendor_code 0x%x\n",
>>> +	HCA_PRINT(TRACE_LEVEL_WARNING,HCA_DBG_QP,("QP %06x Async event	event_type 0x%x vendor_code 0x%x\n",
>>>  		qpn,event_type,vendor_code));
>>>  	qp->ibqp.event_handler(&event);
>>> @@ -421,7 +421,7 @@ static int to_mthca_state(enum ib_qp_sta
>>>  	case IBQPS_SQD:   return MTHCA_QP_STATE_SQD;
>>>  	case IBQPS_SQE: return MTHCA_QP_STATE_SQE;
>>>  	case IBQPS_ERR:   return MTHCA_QP_STATE_ERR;
>>> -	default:                return -1;
>>> +	default:				return -1;
>>>  	}
>>>  }
>>> @@ -445,10 +445,10 @@ static inline enum ib_qp_state to_ib_qp_
>>>  	case MTHCA_QP_STATE_RTR:   return IBQPS_RTR;
>>>  	case MTHCA_QP_STATE_RTS:   return IBQPS_RTS;
>>>  	case MTHCA_QP_STATE_SQD:   return IBQPS_SQD;
>>> -	case MTHCA_QP_STATE_DRAINING:   return IBQPS_SQD;
>>> +	case MTHCA_QP_STATE_DRAINING:	return IBQPS_SQD;
>>>  	case MTHCA_QP_STATE_SQE:   return IBQPS_SQE;
>>>  	case MTHCA_QP_STATE_ERR:   return IBQPS_ERR;
>>> -	default:                return -1;
>>> +	default:				return -1;
>>>  	}
>>>  }
>>> @@ -480,17 +480,17 @@ static void to_ib_ah_attr(struct mthca_d
>>>  				struct mthca_qp_path *path)
>>>  {
>>>  	memset(ib_ah_attr, 0, sizeof *ib_ah_attr);
>>> -	ib_ah_attr->port_num 	  = (u8)((cl_ntoh32(path->port_pkey) >> 24)
>>> & 0x3);
>>> +	ib_ah_attr->port_num	  = (u8)((cl_ntoh32(path->port_pkey) >> 24)
>>> & 0x3);
>>> 
>>>  	if (ib_ah_attr->port_num == 0 || ib_ah_attr->port_num > dev->limits.num_ports)
>>>  		return;
>>> -	ib_ah_attr->dlid     	  = cl_ntoh16(path->rlid);
>>> -	ib_ah_attr->sl       	  = (u8)(cl_ntoh32(path->sl_tclass_flowlabel) >> 28);
>>> +	ib_ah_attr->dlid		  = cl_ntoh16(path->rlid);
>>> +	ib_ah_attr->sl			  = (u8)(cl_ntoh32(path->sl_tclass_flowlabel) >> 28);
>>>  	ib_ah_attr->src_path_bits = path->g_mylmc & 0x7f;
>>>  	//TODO: work around: set always full speed 	- really, it's much more complicate
>>>  	ib_ah_attr->static_rate   = 0;
>>> -	ib_ah_attr->ah_flags      = (path->g_mylmc & (1 << 7)) ? IB_AH_GRH : 0;
>>> +	ib_ah_attr->ah_flags	  = (path->g_mylmc & (1 << 7)) ? IB_AH_GRH : 0;
>>>  	if (ib_ah_attr->ah_flags) {
>>>  		ib_ah_attr->grh.sgid_index = (u8)(path->mgid_index & (dev->limits.gid_table_len - 1));
>>>  		ib_ah_attr->grh.hop_limit  = path->hop_limit;
>>> @@ -540,20 +540,20 @@ int mthca_query_qp(struct ib_qp *ibqp, s
>>>  		goto out_mailbox;
>>>  	}
>>> -	qp_param    = mailbox->buf;
>>> -	context     = &qp_param->context;
>>> +	qp_param	= mailbox->buf;
>>> +	context 	= &qp_param->context;
>>>  	mthca_state = cl_ntoh32(context->flags) >> 28;
>>> -	qp->state		     = to_ib_qp_state(mthca_state);
>>> -	qp_attr->qp_state	     = qp->state;
>>> -	qp_attr->path_mtu 	     = context->mtu_msgmax >> 5;
>>> -	qp_attr->path_mig_state      =
>>> +	qp->state			 = to_ib_qp_state(mthca_state);
>>> +	qp_attr->qp_state		 = qp->state;
>>> +	qp_attr->path_mtu		 = context->mtu_msgmax >> 5;
>>> +	qp_attr->path_mig_state 	 =
>>>  		to_ib_mig_state((cl_ntoh32(context->flags) >> 11) & 0x3);
>>> -	qp_attr->qkey 		     = cl_ntoh32(context->qkey);
>>> -	qp_attr->rq_psn 	     = cl_ntoh32(context->rnr_nextrecvpsn) & 0xffffff;
>>> -	qp_attr->sq_psn 	     = cl_ntoh32(context->next_send_psn) & 0xffffff;
>>> -	qp_attr->dest_qp_num 	     = cl_ntoh32(context->remote_qpn) & 0xffffff;
>>> -	qp_attr->qp_access_flags     =
>>> +	qp_attr->qkey			 = cl_ntoh32(context->qkey);
>>> +	qp_attr->rq_psn 		 = cl_ntoh32(context->rnr_nextrecvpsn) & 0xffffff;
>>> +	qp_attr->sq_psn 		 = cl_ntoh32(context->next_send_psn) & 0xffffff;
>>> +	qp_attr->dest_qp_num	 = cl_ntoh32(context->remote_qpn) & 0xffffff;
>>> +	qp_attr->qp_access_flags =
>>>  		to_ib_qp_access_flags(cl_ntoh32(context->params2));
>>>  
>>>  	if (qp->transport == RC || qp->transport == UC) {
>>> @@ -561,11 +561,11 @@ int mthca_query_qp(struct ib_qp *ibqp, s
>>>  		to_ib_ah_attr(dev, &qp_attr->alt_ah_attr, &context->alt_path);
>>>  		qp_attr->alt_pkey_index =
>>>  			(u16)(cl_ntoh32(context->alt_path.port_pkey) & 0x7f);
>>> -		qp_attr->alt_port_num 	= qp_attr->alt_ah_attr.port_num;
>>> +		qp_attr->alt_port_num	= qp_attr->alt_ah_attr.port_num;
>>>  	}
>>>  
>>>  	qp_attr->pkey_index = (u16)(cl_ntoh32(context->pri_path.port_pkey) & 0x7f);
>>> -	qp_attr->port_num   =
>>> +	qp_attr->port_num	=
>>>  		(u8)((cl_ntoh32(context->pri_path.port_pkey) >> 24) & 0x3);
>>>  
>>>  	/* qp_attr->en_sqd_async_notify is only applicable in modify qp */
>>> @@ -575,22 +575,23 @@ int mthca_query_qp(struct ib_qp *ibqp, s
>>> 
>>>  	qp_attr->max_dest_rd_atomic =
>>>  		(u8)(1 << ((cl_ntoh32(context->params2) >> 21) & 0x7));
>>> -	qp_attr->min_rnr_timer 	    =
>>> +	qp_attr->min_rnr_timer		=
>>>  		(u8)((cl_ntoh32(context->rnr_nextrecvpsn) >> 24) & 0x1f);
>>> -	qp_attr->timeout 	    = context->pri_path.ackto >> 3;
>>> -	qp_attr->retry_cnt 	    = (u8)((cl_ntoh32(context->params1) >>
>>> 16) & 0x7);
>>> -	qp_attr->rnr_retry 	    = context->pri_path.rnr_retry >> 5;
>>> -	qp_attr->alt_timeout 	    = context->alt_path.ackto >> 3;
>>> +	qp_attr->timeout		= context->pri_path.ackto >> 3;
>>> +	qp_attr->retry_cnt		= (u8)((cl_ntoh32(context->params1) >> 16) & 0x7);
>>> +	qp_attr->rnr_retry		= context->pri_path.rnr_retry >> 5;
>>> +	qp_attr->alt_timeout		= context->alt_path.ackto >> 3;
>>> 
>>>  done:
>>> -	qp_attr->cur_qp_state	     = qp_attr->qp_state;
>>> -	qp_attr->cap.max_send_wr     = qp->sq.max;
>>> -	qp_attr->cap.max_recv_wr     = qp->rq.max;
>>> -	qp_attr->cap.max_send_sge    = qp->sq.max_gs;
>>> -	qp_attr->cap.max_recv_sge    = qp->rq.max_gs;
>>> +	qp_attr->cur_qp_state		 = qp_attr->qp_state;
>>> +	qp_attr->cap.max_send_wr	 = qp->sq.max;
>>> +	qp_attr->cap.max_recv_wr	 = qp->rq.max;
>>> +	qp_attr->cap.max_send_sge	 = qp->sq.max_gs;
>>> +	qp_attr->cap.max_recv_sge	 = qp->rq.max_gs;
>>>  	qp_attr->cap.max_inline_data = qp->max_inline_data;
>>> -	qp_init_attr->cap	     = qp_attr->cap;
>>> +	qp_init_attr->cap			 = qp_attr->cap;
>>> +	qp_init_attr->sq_sig_type	 = qp->sq_policy;
>>> 
>>>  out_mailbox:
>>>  	mthca_free_mailbox(dev, mailbox);
>>> @@ -619,11 +620,11 @@ static void init_port(struct mthca_dev *
>>> 
>>>  	RtlZeroMemory(&param, sizeof param);
>>> -	param.port_width    = dev->limits.port_width_cap;
>>> -	param.vl_cap    = dev->limits.vl_cap;
>>> -	param.mtu_cap   = dev->limits.mtu_cap;
>>> -	param.gid_cap   = (u16)dev->limits.gid_table_len;
>>> -	param.pkey_cap  = (u16)dev->limits.pkey_table_len;
>>> +	param.port_width	= dev->limits.port_width_cap;
>>> +	param.vl_cap	= dev->limits.vl_cap;
>>> +	param.mtu_cap	= dev->limits.mtu_cap;
>>> +	param.gid_cap	= (u16)dev->limits.gid_table_len;
>>> +	param.pkey_cap	= (u16)dev->limits.pkey_table_len;
>>> 
>>>  	err = mthca_INIT_IB(dev, &param, port, &status);
>>>  	if (err)
>>> @@ -753,7 +754,7 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  	}
>>>  
>>>  	if (attr_mask & IB_QP_MAX_DEST_RD_ATOMIC &&
>>> -	    attr->max_dest_rd_atomic > 1 << dev->qp_table.rdb_shift) {
>>> +		attr->max_dest_rd_atomic > 1 << dev->qp_table.rdb_shift) {
>>>  		HCA_PRINT(TRACE_LEVEL_ERROR ,HCA_DBG_QP,("Max rdma_atomic as responder %u too large (max %d)\n",
>>>  			  attr->max_dest_rd_atomic, 1 << dev->qp_table.rdb_shift));
>>>  		goto out;
>>> @@ -768,9 +769,9 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  	qp_context = &qp_param->context;
>>>  	RtlZeroMemory(qp_param, sizeof *qp_param);
>>> -	qp_context->flags      = cl_hton32((to_mthca_state(new_state) <<
>>> 28) |
>>> -					     (to_mthca_st(qp->transport) <<
>>> 16));
>>> -	qp_context->flags     |= cl_hton32(MTHCA_QP_BIT_DE);
>>> +	qp_context->flags	   = cl_hton32((to_mthca_state(new_state)
>>> << 28) |
>>> +						 (to_mthca_st(qp->transport)
>>> << 16));
>>> +	qp_context->flags	  |= cl_hton32(MTHCA_QP_BIT_DE);
>>>  	if (!(attr_mask & IB_QP_PATH_MIG_STATE))
>>>  		qp_context->flags |= cl_hton32(MTHCA_QP_PM_MIGRATED << 11);
>>>  	else {
>>> @@ -846,20 +847,20 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  	}
>>>  
>>>  	if (attr_mask & IB_QP_AV) {
>>> -		qp_context->pri_path.g_mylmc     = attr->ah_attr.src_path_bits & 0x7f;
>>> -		qp_context->pri_path.rlid        = cl_hton16(attr->ah_attr.dlid);
>>> -		//TODO: work around: set always full speed  - really, it's much more complicate
>>> +		qp_context->pri_path.g_mylmc	 = attr->ah_attr.src_path_bits & 0x7f;
>>> +		qp_context->pri_path.rlid 		 = cl_hton16(attr->ah_attr.dlid);
>>> +		//TODO: work around: set always full speed	- really, it's much more complicate
>>>  		qp_context->pri_path.static_rate = 0;
>>>  		if (attr->ah_attr.ah_flags & IB_AH_GRH) {
>>>  			qp_context->pri_path.g_mylmc |= 1 << 7;
>>>  			qp_context->pri_path.mgid_index = attr->ah_attr.grh.sgid_index;
>>>  			qp_context->pri_path.hop_limit = attr->ah_attr.grh.hop_limit;
>>>  			qp_context->pri_path.sl_tclass_flowlabel =
>>> -				cl_hton32((attr->ah_attr.sl << 28)                |
>>> -					    (attr->ah_attr.grh.traffic_class << 20) |
>>> -					    (attr->ah_attr.grh.flow_label));
>>> +				cl_hton32((attr->ah_attr.sl << 28)			  |
>>> +						(attr->ah_attr.grh.traffic_class << 20) |
>>> +						(attr->ah_attr.grh.flow_label));
>>>  			memcpy(qp_context->pri_path.rgid,
>>> -			       attr->ah_attr.grh.dgid.raw, 16);
>>> +				   attr->ah_attr.grh.dgid.raw, 16);
>>>  		} else {
>>>  			qp_context->pri_path.sl_tclass_flowlabel =
>>>  				cl_hton32(attr->ah_attr.sl << 28);
>>> @@ -875,7 +876,7 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  	/* XXX alt_path */
>>>  
>>>  	/* leave rdd as 0 */
>>> -	qp_context->pd         = cl_hton32(to_mpd(ibqp->pd)->pd_num);
>>> +	qp_context->pd		   = cl_hton32(to_mpd(ibqp->pd)->pd_num);
>>>  	/* leave wqe_base as 0 (we always create an MR based at 0 for WQs) */
>>>  	qp_context->wqe_lkey   = cl_hton32(qp->mr.ibmr.lkey);
>>>  	qp_context->params1    = cl_hton32((unsigned long)(
>>> @@ -893,7 +894,7 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  		if (attr->max_rd_atomic) {
>>>  			qp_context->params1 |= 	cl_hton32(MTHCA_QP_BIT_SRE |
>>> -					    MTHCA_QP_BIT_SAE);
>>> +						MTHCA_QP_BIT_SAE);
>>>  			qp_context->params1 |= 	cl_hton32(fls(attr->max_rd_atomic - 1) << 21);
>>>  		}
>>> @@ -920,7 +921,7 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  	}
>>> 
>>>  	if (attr_mask & (IB_QP_ACCESS_FLAGS | IB_QP_MAX_DEST_RD_ATOMIC)) {
>>> -		qp_context->params2      |= get_hw_access_flags(qp, attr, attr_mask);
>>> +		qp_context->params2 	 |= get_hw_access_flags(qp, attr, attr_mask);
>>>  		qp_param->opt_param_mask |= cl_hton32(MTHCA_QP_OPTPAR_RWE |
>>>  							MTHCA_QP_OPTPAR_RRE |
>>>  							MTHCA_QP_OPTPAR_RAE);
>>> @@ -940,8 +941,8 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>> 
>>>  	qp_context->ra_buff_indx =
>>>  		cl_hton32(dev->qp_table.rdb_base +
>>> -			    ((qp->qpn & (dev->limits.num_qps - 1)) *
>>> MTHCA_RDB_ENTRY_SIZE <<
>>> -			     dev->qp_table.rdb_shift));
>>> +				((qp->qpn & (dev->limits.num_qps - 1)) *
>>> MTHCA_RDB_ENTRY_SIZE <<
>>> +				 dev->qp_table.rdb_shift));
>>> 
>>>  	qp_context->cqn_rcv = cl_hton32(to_mcq(ibqp->recv_cq)->cqn);
>>> @@ -955,25 +956,25 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>> 
>>>  	if (ibqp->srq)
>>>  		qp_context->srqn = cl_hton32(1 << 24 |
>>> -					       to_msrq(ibqp->srq)->srqn);
>>> +						   to_msrq(ibqp->srq)->srqn);
>>> 
>>>  	if (cur_state == IBQPS_RTS && new_state == IBQPS_SQD	&&
>>> -	    attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY		&&
>>> -	    attr->en_sqd_async_notify)
>>> +		attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY		&&
>>> +		attr->en_sqd_async_notify)
>>>  		sqd_event = (u32)(1 << 31);
>>>  
>>>  	err = mthca_MODIFY_QP(dev,
>>> state_table[cur_state][new_state].trans,
>>> -			      qp->qpn, 0, mailbox, sqd_event, &status);
>>> +				  qp->qpn, 0, mailbox, sqd_event, &status);
>>>  	if (err) {
>>>  		HCA_PRINT(TRACE_LEVEL_ERROR
>>> ,HCA_DBG_QP,("mthca_MODIFY_QP returned error (qp-num = 0x%x)
>>> returned status %02x "
>>> -			"cur_state  = %d  new_state = %d attr_mask = %d
>>> req_param = %d opt_param = %d\n",
>>> +			"cur_state	= %d  new_state = %d attr_mask =
>>> %d req_param = %d opt_param = %d\n",
>>>  			ibqp->qp_num, status, cur_state, new_state,
>>> -			attr_mask, req_param, opt_param));
>>> +			attr_mask, req_param, opt_param));
>>>  		goto out_mailbox;
>>>  	}
>>>  	if (status) {
>>>  		HCA_PRINT(TRACE_LEVEL_ERROR
>>> ,HCA_DBG_QP,("mthca_MODIFY_QP bad status(qp-num = 0x%x) returned
>>> status %02x "
>>> -			"cur_state  = %d  new_state = %d attr_mask = %d
>>> req_param = %d opt_param = %d\n",
>>> +			"cur_state	= %d  new_state = %d attr_mask = %d
>>> req_param = %d opt_param = %d\n",
>>>  			ibqp->qp_num, status, cur_state, new_state,
>>>  			attr_mask, req_param, opt_param));
>>>  		err = -EINVAL;
>>> @@ -1011,10 +1012,10 @@ int mthca_modify_qp(struct ib_qp *ibqp,
>>>  	 */
>>>  	if (new_state == IBQPS_RESET && !qp->ibqp.ucontext) {
>>>  		mthca_cq_clean(dev, to_mcq(qp->ibqp.send_cq)->cqn, qp->qpn,
>>> -			       qp->ibqp.srq ? to_msrq(qp->ibqp.srq) : NULL);
>>> +				   qp->ibqp.srq ? to_msrq(qp->ibqp.srq) :
>>> NULL);
>>>  		if (qp->ibqp.send_cq != qp->ibqp.recv_cq)
>>>  			mthca_cq_clean(dev, to_mcq(qp->ibqp.recv_cq)->cqn, qp->qpn,
>>> -				       qp->ibqp.srq ? to_msrq(qp->ibqp.srq) : NULL);
>>> +					   qp->ibqp.srq ? to_msrq(qp->ibqp.srq) : NULL);
>>> 
>>>  		mthca_wq_init(&qp->sq);
>>>  		qp->sq.last = get_send_wqe(qp, qp->sq.max - 1);
>>> @@ -1080,7 +1081,7 @@ static void mthca_adjust_qp_caps(struct
>>>  		(int)(max_data_size / sizeof (struct mthca_data_seg)));
>>>  	qp->rq.max_gs = min(dev->limits.max_sg,
>>>  		(int)((min(dev->limits.max_desc_sz, 1 << qp->rq.wqe_shift) -
>>> -		sizeof (struct mthca_next_seg)) / sizeof (struct mthca_data_seg)));
>>> +		sizeof (struct mthca_next_seg)) / sizeof (struct mthca_data_seg)));
>>>  }
>>>  
>>>  /*
>>> @@ -1091,8 +1092,8 @@ static void mthca_adjust_qp_caps(struct
>>>   * queue)
>>>   */
>>>  static int mthca_alloc_wqe_buf(struct mthca_dev *dev,
>>> -			       struct mthca_pd *pd,
>>> -			       struct mthca_qp *qp)
>>> +				   struct mthca_pd *pd,
>>> +				   struct mthca_qp *qp)
>>>  {
>>>  	int size;
>>>  	int err = -ENOMEM;
>>> @@ -1105,7 +1106,7 @@ static int mthca_alloc_wqe_buf(struct mt
>>>  		return -EINVAL;
>>>  
>>>  	for (qp->rq.wqe_shift = 6; 1 << qp->rq.wqe_shift < size;
>>> -	     qp->rq.wqe_shift++)
>>> +		 qp->rq.wqe_shift++)
>>>  		; /* nothing */
>>>  
>>>  	size = qp->sq.max_gs * sizeof (struct mthca_data_seg);
>>> @@ -1149,11 +1150,11 @@ static int mthca_alloc_wqe_buf(struct mt
>>>  		return -EINVAL;
>>> 
>>>  	for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size;
>>> -	     qp->sq.wqe_shift++)
>>> +		 qp->sq.wqe_shift++)
>>>  		; /* nothing */
>>>  
>>>  	qp->send_wqe_offset = ALIGN(qp->rq.max << qp->rq.wqe_shift,
>>> -				    1 << qp->sq.wqe_shift);
>>> +					1 << qp->sq.wqe_shift);
>>> 
>>>  	/*
>>>  	 * If this is a userspace QP, we don't actually have to
>>> @@ -1172,7 +1173,7 @@ static int mthca_alloc_wqe_buf(struct mt
>>>  		goto err_out;
>>> 
>>>  	err = mthca_buf_alloc(dev, size, MTHCA_MAX_DIRECT_QP_SIZE,
>>> -			      &qp->queue, &qp->is_direct, pd, 0, &qp->mr);
>>> +				  &qp->queue, &qp->is_direct, pd, 0, &qp->mr);
>>>  	if (err)
>>>  		goto err_out;
>>> @@ -1185,16 +1186,16 @@ err_out:
>>>  }
>>>  
>>>  static void mthca_free_wqe_buf(struct mthca_dev *dev,
>>> -			       struct mthca_qp *qp)
>>> +				   struct mthca_qp *qp)
>>>  {
>>>  	mthca_buf_free(dev, (int)(LONG_PTR)NEXT_PAGE_ALIGN(qp->send_wqe_offset +
>>> -				       (qp->sq.max << qp->sq.wqe_shift)),
>>> -		       &qp->queue, qp->is_direct, &qp->mr);
>>> +					   (qp->sq.max << qp->sq.wqe_shift)),
>>> +			   &qp->queue, qp->is_direct, &qp->mr);
>>>  	kfree(qp->wrid);
>>>  }
>>>  
>>>  static int mthca_map_memfree(struct mthca_dev *dev,
>>> -			     struct mthca_qp *qp)
>>> +				 struct mthca_qp *qp)
>>>  {
>>>  	int ret;
>>> @@ -1207,10 +1208,10 @@ static int mthca_map_memfree(struct mthc
>>>  		if (ret)
>>>  			goto err_qpc;
>>> - 		ret = mthca_table_get(dev, dev->qp_table.rdb_table,
>>> - 				      qp->qpn << dev->qp_table.rdb_shift);
>>> - 		if (ret)
>>> - 			goto err_eqpc;
>>> +		ret = mthca_table_get(dev, dev->qp_table.rdb_table,
>>> +					  qp->qpn << dev->qp_table.rdb_shift);
>>> +		if (ret)
>>> +			goto err_eqpc;
>>> 
>>>  	} @@ -1235,7 +1236,7 @@ static void mthca_unmap_memfree(struct
> m }
>>> 
>>>  static int mthca_alloc_memfree(struct mthca_dev *dev,
>>> -			       struct mthca_qp *qp)
>>> +				   struct mthca_qp *qp)
>>>  { 	int ret = 0; @@ -1258,7 +1259,7 @@ static int
>>>  mthca_alloc_memfree(struct mt }
>>>  
>>>  static void mthca_free_memfree(struct mthca_dev *dev,
>>> -			       struct mthca_qp *qp)
>>> +				   struct mthca_qp *qp)
>>>  { 	if (mthca_is_memfree(dev)) { 		mthca_free_db(dev,
>>>  MTHCA_DB_TYPE_SQ, qp- sq.db_index); @@ -1280,10 +1281,10 @@ static
>>>  int mthca_alloc_qp_common(struct 	init_waitqueue_head(&qp->wait);
>>>  	KeInitializeMutex(&qp->mutex, 0);
>>> -	qp->state    	 = IBQPS_RESET;
>>> +	qp->state		 = IBQPS_RESET;
>>>  	qp->atomic_rd_en = 0;
>>> -	qp->resp_depth   = 0;
>>> -	qp->sq_policy    = send_policy;
>>> +	qp->resp_depth	 = 0;
>>> +	qp->sq_policy	 = send_policy;
>>>  	mthca_wq_init(&qp->sq); 	mthca_wq_init(&qp->rq); @@ - 1321,7 +1322,7
>>>  @@ static int mthca_alloc_qp_common(struct 		struct mthca_next_seg
>>>  *next; 		struct mthca_data_seg *scatter; 		int size = (sizeof (struct
>>>  mthca_next_seg) +
>>> -			    qp->rq.max_gs * sizeof (struct mthca_data_seg)) /
>>> 16;
>>> +				qp->rq.max_gs * sizeof (struct
>>> mthca_data_seg)) / 16;
>>> 
>>>  		for (i = 0; i < qp->rq.max; ++i) { 			next = get_recv_wqe(qp, i);
>>>  @@ -1330,15 +1331,15 @@ static int mthca_alloc_qp_common(struct
>>>  			next->ee_nds = cl_hton32(size);
>>>  
>>>  			for (scatter = (void *) (next + 1);
>>> -			     (void *) scatter < (void *) ((u8*)next + (u32)(1 <<
>>> qp->rq.wqe_shift)); -			     ++scatter) + 	 (void *) scatter < (void
>>> *) ((u8*)next + (u32)(1 << qp->rq.wqe_shift)); +
>> 	 ++scatter)
>>>  				scatter->lkey =
>> cl_hton32(MTHCA_INVAL_LKEY); 		}
>>> 
>>>  		for (i = 0; i < qp->sq.max; ++i) {
>>>  			next = get_send_wqe(qp, i);
>>>  			next->nda_op = cl_hton32((((i + 1) & (qp->sq.max -
>>> 1)) <<
>>> -						    qp->sq.wqe_shift) +
>>> +							qp->sq.wqe_shift) +
>>>  						   qp->send_wqe_offset);
>>>  		}
>>>  	}
>>> @@ -1355,11 +1356,11 @@ static int mthca_set_qp_size(struct mthc
>>>  	int max_data_size = mthca_max_data_size(dev, qp, dev->limits.max_desc_sz);
>>> 
>>>  	/* Sanity check QP size before proceeding */
>>> -	if (cap->max_send_wr  	 > (u32)dev->limits.max_wqes ||
>>> -	    cap->max_recv_wr  	 > (u32)dev->limits.max_wqes ||
>>> -	    cap->max_send_sge 	 > (u32)dev->limits.max_sg   ||
>>> -	    cap->max_recv_sge 	 > (u32)dev->limits.max_sg   ||
>>> -	    cap->max_inline_data > (u32)mthca_max_inline_data(max_data_size))
>>> +	if (cap->max_send_wr	 > (u32)dev->limits.max_wqes ||
>>> +		cap->max_recv_wr	 > (u32)dev->limits.max_wqes ||
>>> +		cap->max_send_sge	 > (u32)dev->limits.max_sg	 ||
>>> +		cap->max_recv_sge	 > (u32)dev->limits.max_sg	 ||
>>> +		cap->max_inline_data > (u32)mthca_max_inline_data(max_data_size))
>>>  		return -EINVAL;
>>>  
>>>  	/*
>>> @@ -1387,9 +1388,9 @@ static int mthca_set_qp_size(struct mthc
>>> 
>>>  	qp->rq.max_gs = cap->max_recv_sge;
>>>  	qp->sq.max_gs = MAX(cap->max_send_sge,
>>> -			      ALIGN(cap->max_inline_data +
>>> MTHCA_INLINE_HEADER_SIZE,
>>> -				    MTHCA_INLINE_CHUNK_SIZE) /
>>> -			      (int)sizeof (struct mthca_data_seg));
>>> +				  ALIGN(cap->max_inline_data +
>>> MTHCA_INLINE_HEADER_SIZE,
>>> +					MTHCA_INLINE_CHUNK_SIZE) /
>>> +				  (int)sizeof (struct mthca_data_seg));
>>> 
>>>  	return 0; } @@ -1422,7 +1423,7 @@ int mthca_alloc_qp(struct
>>>  mthca_dev *dev 		return -ENOMEM;
>>>  
>>>  	err = mthca_alloc_qp_common(dev, pd, send_cq, recv_cq,
>>> -				    send_policy, qp);
>>> +					send_policy, qp);
>>>  	if (err) { 		mthca_free(&dev->qp_table.alloc, qp->qpn); 		return
>>>  err; @@ -1437,14 +1438,14 @@ int mthca_alloc_qp(struct mthca_dev *dev
>>>  }
>>>  
>>>  int mthca_alloc_sqp(struct mthca_dev *dev,
>>> -		    struct mthca_pd *pd,
>>> -		    struct mthca_cq *send_cq,
>>> -		    struct mthca_cq *recv_cq,
>>> -		    enum ib_sig_type send_policy,
>>> -		    struct ib_qp_cap *cap,
>>> -		    int qpn,
>>> -		    int port,
>>> -		    struct mthca_sqp *sqp)
>>> +			struct mthca_pd *pd,
>>> +			struct mthca_cq *send_cq,
>>> +			struct mthca_cq *recv_cq,
>>> +			enum ib_sig_type send_policy,
>>> +			struct ib_qp_cap *cap,
>>> +			int qpn,
>>> +			int port,
>>> +			struct mthca_sqp *sqp)
>>>  { 	u32 mqpn = qpn * 2 + dev->qp_table.sqp_start + port - 1; 	int err;
>>>  @@ -1474,11 +1475,11 @@ int mthca_alloc_sqp(struct mthca_dev *de
>>>  		goto err_out;
>>>  
>>>  	sqp->port = port;
>>> -	sqp->qp.qpn       = mqpn;
>>> +	sqp->qp.qpn 	  = mqpn;
>>>  	sqp->qp.transport = MLX;
>>>  
>>>  	err = mthca_alloc_qp_common(dev, pd, send_cq, recv_cq,
>>> -				    send_policy, &sqp->qp);
>>> +					send_policy, &sqp->qp);
>>>  	if (err)
>>>  		goto err_out_free;
>>> @@ -1558,10 +1559,10 @@ void mthca_free_qp(struct mthca_dev *dev
>>>  	 */
>>>  	if (!qp->ibqp.ucontext) {
>>>  		mthca_cq_clean(dev, to_mcq(qp->ibqp.send_cq)->cqn, qp->qpn,
>>> -			       qp->ibqp.srq ? to_msrq(qp->ibqp.srq) : NULL);
>>> +				   qp->ibqp.srq ? to_msrq(qp->ibqp.srq) :
>>> NULL);
>>>  		if (qp->ibqp.send_cq != qp->ibqp.recv_cq)
>>>  			mthca_cq_clean(dev, to_mcq(qp->ibqp.recv_cq)->cqn, qp->qpn,
>>> -				       qp->ibqp.srq ? to_msrq(qp->ibqp.srq) : NULL);
>>> +					   qp->ibqp.srq ? to_msrq(qp->ibqp.srq) : NULL);
>>> 
>>>  		mthca_free_memfree(dev, qp);
>>>  		mthca_free_wqe_buf(dev, qp);
>>> @@ -1587,12 +1588,12 @@ static enum mthca_wr_opcode conv_ibal_wr
>>>  		case WR_SEND:
>>>  			opcode = (wr->send_opt & IB_SEND_OPT_IMMEDIATE) ? MTHCA_OPCODE_SEND_IMM : MTHCA_OPCODE_SEND;
>>>  			break;
>>> -		case WR_RDMA_WRITE:
>>> +		case WR_RDMA_WRITE:
>>>  			opcode = (wr->send_opt &
>>> IB_SEND_OPT_IMMEDIATE) ? MTHCA_OPCODE_RDMA_WRITE_IMM :
>>> MTHCA_OPCODE_RDMA_WRITE;
>>>  			break;
>>> -		case WR_RDMA_READ: 		opcode =
>>> MTHCA_OPCODE_RDMA_READ; break;
>>> -		case WR_COMPARE_SWAP: 		opcode =
>>> MTHCA_OPCODE_ATOMIC_CS; break;
>>> -		case WR_FETCH_ADD: 			opcode =
>>> MTHCA_OPCODE_ATOMIC_FA; break;
>>> +		case WR_RDMA_READ:		opcode =
>>> MTHCA_OPCODE_RDMA_READ; break;
>>> +		case WR_COMPARE_SWAP:		opcode =
>>> MTHCA_OPCODE_ATOMIC_CS; break;
>>> +		case WR_FETCH_ADD:			opcode =
>>> MTHCA_OPCODE_ATOMIC_FA; break;
>>>  		default: 	opcode = MTHCA_OPCODE_INVALID;break;
>> 	} 	return opcode;
>>> @@ -1600,9 +1601,9 @@ static enum mthca_wr_opcode conv_ibal_wr
>>> 
>>>  /* Create UD header for an MLX send and build a data segment for it */
>>>  static int build_mlx_header(struct mthca_dev *dev, struct mthca_sqp
>> *sqp,
>>> -			    int ind, struct _ib_send_wr *wr,
>>> -			    struct mthca_mlx_seg *mlx,
>>> -			    struct mthca_data_seg *data)
>>> +				int ind, struct _ib_send_wr *wr,
>>> +				struct mthca_mlx_seg *mlx,
>>> +				struct mthca_data_seg *data)
>>>  {
>>>  	enum ib_wr_opcode opcode = conv_ibal_wr_opcode(wr);
>>>  	int header_size;
>>> @@ -1618,7 +1619,7 @@ static int build_mlx_header(struct mthca
>>> 
>>>  	ib_ud_header_init(256, /* assume a MAD */
>>>  		mthca_ah_grh_present(to_mah((struct ib_ah *)wr->dgrm.ud.h_av)),
>>> -	  	&sqp->ud_header);
>>> +		&sqp->ud_header);
>>> 
>>>  	err = mthca_read_ah(dev, to_mah((struct ib_ah *)wr->dgrm.ud.h_av), &sqp->ud_header);
>>>  	if (err){
>>> @@ -1662,7 +1663,7 @@ static int build_mlx_header(struct mthca
>>>  	sqp->ud_header.bth.destination_qpn = wr->dgrm.ud.remote_qp;
>>>  	sqp->ud_header.bth.psn = cl_hton32((sqp->send_psn++) & ((1 << 24) - 1));
>>>  	sqp->ud_header.deth.qkey = wr->dgrm.ud.remote_qkey & 0x00000080 ?
>>> -					       cl_hton32(sqp->qkey) : wr->dgrm.ud.remote_qkey;
>>> +						   cl_hton32(sqp->qkey) : wr->dgrm.ud.remote_qkey;
>>>  	sqp->ud_header.deth.source_qpn = cl_hton32(sqp->qp.ibqp.qp_num);
>>> 
>>>  	header_size = ib_ud_header_pack(&sqp->ud_header,
>>> @@ -1670,15 +1671,15 @@ static int build_mlx_header(struct mthca
>>>  		ind * MTHCA_UD_HEADER_SIZE);
>>> 
>>>  	data->byte_count = cl_hton32(header_size);
>>> -	data->lkey       = cl_hton32(to_mpd(sqp->qp.ibqp.pd)->ntmr.ibmr.lkey);
>>> -	data->addr       = CPU_2_BE64(sqp->sg.dma_address +
>>> -				       ind * MTHCA_UD_HEADER_SIZE);
>>> +	data->lkey		 = cl_hton32(to_mpd(sqp->qp.ibqp.pd)->ntmr.ibmr.lkey);
>>> +	data->addr		 = CPU_2_BE64(sqp->sg.dma_address +
>>> +					   ind * MTHCA_UD_HEADER_SIZE);
>>> 
>>>  	return 0;
>>>  }
>>>  
>>>  static inline int mthca_wq_overflow(struct mthca_wq *wq, int nreq,
>>> -				    struct ib_cq *ib_cq)
>>> +					struct ib_cq *ib_cq)
>>>  { 	unsigned cur; 	struct mthca_cq *cq; @@ -1715,7 +1716,7 @@ int
>>>  mthca_tavor_post_send(struct ib_qp * 	SPIN_LOCK_PREP(lh);
>>>  
>>>  	spin_lock_irqsave(&qp->sq.lock, &lh);
>>> -
>>> +
>>>  	/* XXX check that state is OK to post send */
>>>  
>>>  	ind = qp->sq.next_ind; @@ -1746,7 +1747,7 @@ int
>>>  mthca_tavor_post_send(struct ib_qp * cl_hton32(MTHCA_NEXT_SOLICIT) :
>>>  0)   | 			cl_hton32(1); 		if (opcode ==
>> MTHCA_OPCODE_SEND_IMM||
>>> -		    opcode == MTHCA_OPCODE_RDMA_WRITE_IMM)
>>> +			opcode == MTHCA_OPCODE_RDMA_WRITE_IMM)
>>>  			((struct mthca_next_seg *) wqe)->imm = wr-
>>>  immediate_data;
>>>  
>>>  		wqe += sizeof (struct mthca_next_seg);
>>> @@ -1834,8 +1835,8 @@ int mthca_tavor_post_send(struct ib_qp *
>>> 
>>>  		case MLX:
>>>  			err = build_mlx_header(dev, to_msqp(qp), ind, wr,
>>> -					       (void*)(wqe - sizeof (struct
>>> mthca_next_seg)),
>>> -					       (void*)wqe);
>>> +						   (void*)(wqe - sizeof (struct
>>> mthca_next_seg)),
>>> +						   (void*)wqe);
>>>  			if (err) { 				if (bad_wr)
>> 					*bad_wr = wr; @@ -1872,7 +1873,7
>> @@
>>>  int mthca_tavor_post_send(struct ib_qp *
>> 	}
>>> 
>>>  					memcpy(wqe, (void *) (ULONG_PTR)
>>> sge->vaddr,
>>> -					       sge->length);
>>> +						   sge->length);
>>>  					wqe += sge->length; 		} @@ -1880,20 +1881,20 @@ int
>>>  mthca_tavor_post_send(struct ib_qp * 				size += align(s + sizeof
>>>  *seg, 16) / 16; 			} 		} else {
>>> -
>>> -    		for (i = 0; i < (int)wr->num_ds; ++i) {
>>> -    			((struct mthca_data_seg *) wqe)->byte_count =
>>> -    				cl_hton32(wr->ds_array[i].length);
>>> -    			((struct mthca_data_seg *) wqe)->lkey =
>>> -    				cl_hton32(wr->ds_array[i].lkey);
>>> -    			((struct mthca_data_seg *) wqe)->addr =
>>> -    				cl_hton64(wr->ds_array[i].vaddr);
>>> -    			wqe += sizeof (struct mthca_data_seg);
>>> -    			size += sizeof (struct mthca_data_seg) / 16;
>>> -    			HCA_PRINT(TRACE_LEVEL_VERBOSE ,HCA_DBG_QP
>>> ,("SQ %06x [%02x]  lkey 0x%08x vaddr 0x%I64x 0x%x\n",qp->qpn,i,
>>> -    				(wr->ds_array[i].lkey),(wr->ds_array[i].vaddr),wr->ds_array[i].length));
>>> -    		}
>>> -    	}
>>> +
>>> +			for (i = 0; i < (int)wr->num_ds; ++i) {
>>> +				((struct mthca_data_seg *) wqe)->byte_count =
>>> +					cl_hton32(wr->ds_array[i].length);
>>> +				((struct mthca_data_seg *) wqe)->lkey =
>>> +					cl_hton32(wr->ds_array[i].lkey);
>>> +				((struct mthca_data_seg *) wqe)->addr =
>>> +					cl_hton64(wr->ds_array[i].vaddr);
>>> +				wqe += sizeof (struct mthca_data_seg);
>>> +				size += sizeof (struct mthca_data_seg) / 16;
>>> +				HCA_PRINT(TRACE_LEVEL_VERBOSE ,HCA_DBG_QP
>>> ,("SQ %06x [%02x]  lkey 0x%08x vaddr 0x%I64x 0x%x\n",qp->qpn,i,
>>> +					(wr->ds_array[i].lkey),(wr->ds_array[i].vaddr),wr->ds_array[i].length));
>>> +			}
>>> +		}
>>> 
>>>  		/* Add one more inline data segment for ICRC */
>>>  		if (qp->transport == MLX) {
>>> @@ -1946,19 +1947,19 @@ out:
>>>  		wmb();
>>>  
>>>  		mthca_write64(doorbell,
>>> -			      dev->kar + MTHCA_SEND_DOORBELL,
>>> -			      MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
>>> +				  dev->kar + MTHCA_SEND_DOORBELL,
>>> +				  MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
>>>  	}
>>> 
>>>  	qp->sq.next_ind = ind;
>>>  	qp->sq.head    += nreq;
>>> -    spin_unlock_irqrestore(&lh);
>>> +	spin_unlock_irqrestore(&lh);
>>>  	return err;
>>>  }
>>>  
>>>  int mthca_tavor_post_recv(struct ib_qp *ibqp, struct _ib_recv_wr *wr,
>>> -			     struct _ib_recv_wr **bad_wr)
>>> +				 struct _ib_recv_wr **bad_wr)
>>>  { 	struct mthca_dev *dev = to_mdev(ibqp->device); 	struct mthca_qp
>>>  *qp = to_mqp(ibqp); @@ -1989,7 +1990,7 @@ int
>>>  mthca_tavor_post_recv(struct ib_qp * 			wmb();
>>>  
>>>  			mthca_write64(doorbell, dev->kar +
>>> MTHCA_RECV_DOORBELL,
>>> -		      MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
>>> +			  MTHCA_GET_DOORBELL_LOCK(&dev-
>>>> doorbell_lock));
>>>> 
>>>  			qp->rq.head += MTHCA_TAVOR_MAX_WQES_PER_RECV_DB; 			size0 = 0; @@
>>>  -2034,7 +2035,7 @@ int mthca_tavor_post_recv(struct ib_qp *
>>>  				cl_hton64(wr->ds_array[i].vaddr); 	wqe += sizeof (struct
>>>  mthca_data_seg); 			size += sizeof (struct
>> mthca_data_seg) / 16;
>>> -//			HCA_PRINT(TRACE_LEVEL_ERROR  ,HCA_DBG_QP ,("RQ %06x [%02x]  lkey
>>> 0x%08x vaddr 0x%I64x 0x %x 0x%08x\n",i,qp- qpn,
>>> +//			HCA_PRINT(TRACE_LEVEL_ERROR  ,HCA_DBG_QP ,("RQ %06x [%02x]	lkey
>>> 0x%08x vaddr 0x%I64x 0x %x 0x%08x\n",i,qp-
>>>  qpn, //				(wr->ds_array[i].lkey),(wr-
>>>  ds_array[i].vaddr),wr->ds_array[i].length, wr->wr_id)); 		} @@
>>>  -2064,7 +2065,7 @@ out: 		wmb();
>>>  
>>>  		mthca_write64(doorbell, dev->kar +
>>> MTHCA_RECV_DOORBELL,
>>> -	      MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
>>> +		  MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock));
>>>  	}
>>>  
>>>  	qp->rq.next_ind = ind;
>>> @@ -2228,7 +2229,7 @@ int mthca_arbel_post_send(struct ib_qp *
>>> 
>>>  		case UD:
>>>  			memcpy(((struct mthca_arbel_ud_seg *) wqe)->av,
>>> -			       to_mah((struct ib_ah *)wr->dgrm.ud.h_av)->av,
>>> MTHCA_AV_SIZE); +				   to_mah((struct
> ib_ah *)wr-
>>>> dgrm.ud.h_av)->av, MTHCA_AV_SIZE);
>>>> 			((struct mthca_arbel_ud_seg *) wqe)->dqpn = wr-
>>>> dgrm.ud.remote_qp;
>>>> 			((struct mthca_arbel_ud_seg *) wqe)->qkey = wr-
>>>> dgrm.ud.remote_qkey;
>>> 
>>> @@ -2238,8 +2239,8 @@ int mthca_arbel_post_send(struct ib_qp *
>>> 
>>>  		case MLX:
>>>  			err = build_mlx_header(dev, to_msqp(qp), ind, wr,
>>> -					       (void*)(wqe - sizeof (struct
>>> mthca_next_seg)),
>>> -					       (void*)wqe);
>>> +						   (void*)(wqe - sizeof (struct
>>> mthca_next_seg)),
>>> +						   (void*)wqe);
>>>  			if (err) { 				if (bad_wr)
>> 					*bad_wr = wr; @@ -2257,7 +2258,7
>> @@
>>>  int mthca_arbel_post_send(struct ib_qp * 	*bad_wr = wr; 	goto out; 		}
>>> -        if (wr->send_opt & IB_SEND_OPT_INLINE) {
>>> +		if (wr->send_opt & IB_SEND_OPT_INLINE) {
>>>  			if (wr->num_ds) { 				struct mthca_inline_seg *seg = (struct
>>>  mthca_inline_seg *)wqe; 				uint32_t s = 0; @@ - 2276,7 +2277,7 @@
>>>  int mthca_arbel_post_send(struct ib_qp *
>> 	}
>>> 
>>>  					memcpy(wqe, (void *) (uintptr_t)
>>> sge->vaddr,
>>> -					       sge->length);
>>> +						   sge->length);
>>>  					wqe += sge->length; 		} @@ -2284,17 +2285,17 @@ int
>>>  mthca_arbel_post_send(struct ib_qp * 				size += align(s + sizeof
>>>  *seg, 16) / 16; 			} 		} else {
>>> -    		for (i = 0; i < (int)wr->num_ds; ++i) {
>>> -    			((struct mthca_data_seg *) wqe)->byte_count =
>>> -    				cl_hton32(wr->ds_array[i].length);
>>> -    			((struct mthca_data_seg *) wqe)->lkey =
>>> -    				cl_hton32(wr->ds_array[i].lkey);
>>> -    			((struct mthca_data_seg *) wqe)->addr =
>>> -    				cl_hton64(wr->ds_array[i].vaddr);
>>> -    			wqe += sizeof (struct mthca_data_seg);
>>> -    			size += sizeof (struct mthca_data_seg) / 16;
>>> -    		}
>>> -    	}
>>> +			for (i = 0; i < (int)wr->num_ds; ++i) {
>>> +				((struct mthca_data_seg *) wqe)->byte_count =
>>> +					cl_hton32(wr->ds_array[i].length);
>>> +				((struct mthca_data_seg *) wqe)->lkey =
>>> +					cl_hton32(wr->ds_array[i].lkey);
>>> +				((struct mthca_data_seg *) wqe)->addr =
>>> +					cl_hton64(wr->ds_array[i].vaddr);
>>> +				wqe += sizeof (struct mthca_data_seg);
>>> +				size += sizeof (struct mthca_data_seg) / 16;
>>> +			}
>>> +		}
>>> 
>>>  		/* Add one more inline data segment for ICRC */ 	if (qp- transport
>>>  == MLX) { @@ -2354,8 +2355,8 @@ out: 		 */ 	wmb();
>>>  		mthca_write64(doorbell,
>>> -			      dev->kar + MTHCA_SEND_DOORBELL,
>>> -			      MTHCA_GET_DOORBELL_LOCK(&dev-
>>>> doorbell_lock));
>>> +				  dev->kar + MTHCA_SEND_DOORBELL,
>>> +				  MTHCA_GET_DOORBELL_LOCK(&dev-
>>>> doorbell_lock));
>>>> 	}
>>>> 
>>>> 	spin_unlock_irqrestore(&lh); @@ -2363,7 +2364,7 @@ out: }
>>>> 
>>>> int mthca_arbel_post_recv(struct ib_qp *ibqp, struct _ib_recv_wr *wr,
>>> -			     struct _ib_recv_wr **bad_wr)
>>> +				 struct _ib_recv_wr **bad_wr)
>>>  { 	struct mthca_qp *qp = to_mqp(ibqp); 	int err = 0; @@ -2373,7
>>>  +2374,7 @@ int mthca_arbel_post_recv(struct ib_qp * 	u8 *wqe;
>>>  	SPIN_LOCK_PREP(lh);
>>> - 	spin_lock_irqsave(&qp->rq.lock, &lh);
>>> +	spin_lock_irqsave(&qp->rq.lock, &lh);
>>> 
>>>  	/* XXX check that state is OK to post receive */ @@ -2444,7 +2445,7
>>>  @@ out: }
>>>  
>>>  void mthca_free_err_wqe(struct mthca_dev *dev, struct mthca_qp *qp,
>> int
>>> is_send,
>>> -		       int index, int *dbd, __be32 *new_wqe)
>>> +			   int index, int *dbd, __be32 *new_wqe)
>>>  { 	struct mthca_next_seg *next; @@ -2487,15 +2488,15 @@ int
>>>  mthca_init_qp_table(struct mthca_dev 	 */ 	dev- qp_table.sqp_start =
>>>  (dev->limits.reserved_qps + 1) & ~1UL; 	err =
>>>  mthca_alloc_init(&dev->qp_table.alloc,
>>> -			       dev->limits.num_qps,
>>> -			       (1 << 24) - 1,
>>> -			       dev->qp_table.sqp_start +
>>> -			       MTHCA_MAX_PORTS * 2);
>>> +				   dev->limits.num_qps,
>>> +				   (1 << 24) - 1,
>>> +				   dev->qp_table.sqp_start +
>>> +				   MTHCA_MAX_PORTS * 2);
>>>  	if (err)
>>>  		return err;
>>>  
>>>  	err = mthca_array_init(&dev->qp_table.qp,
>>> -			       dev->limits.num_qps);
>>> +				   dev->limits.num_qps);
>>>  	if (err) {
>>>  		mthca_alloc_cleanup(&dev->qp_table.alloc);
>>>  		return err;
>>> @@ -2503,8 +2504,8 @@ int mthca_init_qp_table(struct mthca_dev
>>> 
>>>  	for (i = 0; i < 2; ++i) {
>>>  		err = mthca_CONF_SPECIAL_QP(dev, i ? IB_QPT_QP1 :
>>> IB_QPT_QP0,
>>> -					    dev->qp_table.sqp_start + i * 2,
>>> -					    &status);
>>> +						dev->qp_table.sqp_start + i *
>>> 2,
>>> +						&status);
>>>  		if (err)
>>>  			goto err_out;
>>>  		if (status) {
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/hw/mthca/mt_utils.h branches\mlx4/hw/mthca/mt_utils.h
>>> --- trunk/hw/mthca/mt_utils.h	2011-09-13 09:16:20.549343600 -0700
>>> +++ branches\mlx4/hw/mthca/mt_utils.h	2011-10-10 16:59:49.555734600 -0700
>>> @@ -111,7 +111,7 @@ static __inline int _ffs(const unsigned
>>>  }
>>> -#define ffs(val)	_ffs((const unsigned long *)&val)
>>> +#define ffs(val)	_ffs((const unsigned long *)&(val))
>>> 
>>>  /**
>>>  * _ffz_raw - find the first zero bit in a word
>>> @@ -202,21 +202,22 @@ static __inline int find_first_zero_bit(
>>>  static __inline int find_next_zero_bit(const unsigned long *addr, int bits_size, int offset)
>>>  {
>>>  	int res;
>>> -	int ix = offset % BITS_PER_LONG;
>>> -	int w_offset = offset / BITS_PER_LONG;
>>> +	int ix = offset & 31;
>>> +	int set = offset & ~31;
>>> +	const unsigned long *p = addr + (set >> 5);
>>> 
>>>  	// search in the first word while we are in the middle
>>>  	if (ix) {
>>> -		res = _ffz_raw(addr + w_offset, ix);
>>> +		res = _ffz_raw(p, ix);
>>>  		if (res)
>>> -			return res - 1;
>>> -		++addr;
>>> -		bits_size -= BITS_PER_LONG;
>>> -		ix = BITS_PER_LONG;
>>> +			return set + res - 1;
>>> +		++p;
>>> +		set += BITS_PER_LONG;
>>>  	}
>>> -	res = find_first_zero_bit( addr, bits_size );
>>> -	return res + ix;
>>> +	// search the rest of the bitmap
>>> +	res = find_first_zero_bit(p, bits_size - (unsigned)(32 * (p - addr)));
>>> +	return res + set;
>>>  }
>>>  
>>>  void fill_bit_tbls();
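
For anyone following the find_next_zero_bit change in the hunk above: the pre-patch code returned res - 1 from the first word without adding the word's base index, and it advanced addr rather than a pointer to the word containing offset. A minimal sketch of the corrected offset handling is below. It assumes fixed 32-bit words, and ffz_from, first_zero_bit and next_zero_bit are stand-in names I made up for illustration, not the mt_utils.h functions.

```c
#include <stdint.h>

#define WORD_BITS 32

/* Hypothetical helper: 1-based index of the first zero bit at or after
 * bit `from` in *word, or 0 if every bit from `from` upward is set. */
static int ffz_from(const uint32_t *word, int from)
{
	for (int bit = from; bit < WORD_BITS; ++bit)
		if (!(*word & (UINT32_C(1) << bit)))
			return bit + 1;
	return 0;
}

/* Hypothetical helper: index of the first zero bit in a bitmap of
 * `nbits` bits, or `nbits` if there is none. */
static int first_zero_bit(const uint32_t *addr, int nbits)
{
	for (int i = 0; i < nbits; ++i)
		if (!(addr[i / WORD_BITS] & (UINT32_C(1) << (i % WORD_BITS))))
			return i;
	return nbits;
}

/* Find the first zero bit at or after `offset`.  A result >= nbits
 * means no zero bit was found. */
static int next_zero_bit(const uint32_t *addr, int nbits, int offset)
{
	int ix = offset & (WORD_BITS - 1);      /* bit position inside the word */
	int base = offset & ~(WORD_BITS - 1);   /* index of that word's bit 0   */
	const uint32_t *p = addr + base / WORD_BITS;

	/* Finish the partially consumed word first. */
	if (ix) {
		int res = ffz_from(p, ix);
		if (res)
			return base + res - 1;  /* add the word base back in */
		++p;
		base += WORD_BITS;
	}
	/* Search the remaining whole words. */
	return base + first_zero_bit(p, nbits - base);
}
```

With a bitmap whose only clear bit in the first two words is bit 40, a search starting at offset 35 returns 40 here; dropping the base, as the old code did, would give 8.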
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/hw/mthca/user/mlnx_ual_main.c
>>> branches\mlx4/hw/mthca/user/mlnx_ual_main.c
>>> --- trunk/hw/mthca/user/mlnx_ual_main.c	2011-09-13 09:16:20.315320200 -0700
>>> +++ branches\mlx4/hw/mthca/user/mlnx_ual_main.c	2011-10-10 16:59:49.172696300 -0700
>>> @@ -44,13 +44,6 @@ uint32_t	mlnx_dbg_lvl = 0; // MLNX_TRACE
>>> 
>>>  static void uvp_init();
>>> -extern BOOL APIENTRY
>>> -_DllMainCRTStartupForGS(
>>> -	IN				HINSTANCE
>>> 		h_module,
>>> -	IN				DWORD
>>> 			ul_reason_for_call,
>>> -	IN				LPVOID
>>> 		lp_reserved );
>>> -
>>> -
>>>  BOOL APIENTRY
>>>  DllMain(
>>>  	IN				HINSTANCE
>>> 		h_module,
>>> @@ -61,14 +54,8 @@ DllMain(
>>>  	{
>>>  	case DLL_PROCESS_ATTACH:
>>>  #if defined(EVENT_TRACING)
>>> -		WPP_INIT_TRACING(L"mthcau.dll");
>>> +    WPP_INIT_TRACING(L"mthcau.dll");
>>>  #endif
>>> -		if( !_DllMainCRTStartupForGS(
>>> -			h_module, ul_reason_for_call, lp_reserved ) )
>>> -		{
>>> -			return FALSE;
>>> -		}
>>> -
>>>  		fill_bit_tbls();
>>>  		uvp_init();
>>>  		break;
>>> @@ -86,8 +73,7 @@ DllMain(
>>>  #endif
>>>  
>>>  	default:
>>> -		return _DllMainCRTStartupForGS(
>>> -			h_module, ul_reason_for_call, lp_reserved );
>>> +		return TRUE;
>>>  	}
>>>  	return TRUE;
>>>  }
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/hw/mthca/user/mlnx_uvp.h
>>> branches\mlx4/hw/mthca/user/mlnx_uvp.h
>>> --- trunk/hw/mthca/user/mlnx_uvp.h	2011-09-13 09:16:20.144303100 -0700
>>> +++ branches\mlx4/hw/mthca/user/mlnx_uvp.h	2011-10-10 16:59:49.049684000 -0700
>>> @@ -131,7 +131,7 @@ struct mthca_cq {
>>>  	int                arm_db_index;
>>>  	uint32_t          *arm_db;
>>>  	int                u_arm_db_index;
>>> -	uint32_t          *p_u_arm_sn;
>>> +	volatile uint32_t *p_u_arm_sn;
>>>  };
>>>  
>>>  struct mthca_srq { @@ -257,7 +257,7 @@ static inline int
>>>  mthca_is_memfree(struc }
>>>  
>>>  int mthca_alloc_db(struct mthca_db_table *db_tab, enum
>> mthca_db_type
>>> type,
>>> -			  uint32_t **db);
>>> +			  volatile uint32_t **db);
>>>  void mthca_set_db_qn(uint32_t *db, enum mthca_db_type type, uint32_t
>>>  qn); void mthca_free_db(struct mthca_db_table *db_tab, enum
>>>  mthca_db_type type, int db_index); struct mthca_db_table
>>>  *mthca_alloc_db_tab(int uarc_size);
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/hw/mthca/user/mlnx_uvp.rc
>>> branches\mlx4/hw/mthca/user/mlnx_uvp.rc
>>> --- trunk/hw/mthca/user/mlnx_uvp.rc	2011-09-13 09:16:20.414330100 -0700
>>> +++ branches\mlx4/hw/mthca/user/mlnx_uvp.rc	2011-10-10 16:59:49.198698900 -0700
>>> @@ -37,8 +37,8 @@
>>> 
>>>  #ifdef DBG
>>>  #define VER_FILEDESCRIPTION_STR     "HCA User Mode Verb Provider
>>> (checked)"
>>> -#define VER_INTERNALNAME_STR		"mthcaud.dll"
>>> -#define VER_ORIGINALFILENAME_STR	"mthcaud.dll"
>>> +#define VER_INTERNALNAME_STR		"mthcau.dll"
>>> +#define VER_ORIGINALFILENAME_STR	"mthcau.dll"
>>>  #else
>>>  #define VER_FILEDESCRIPTION_STR     "HCA User Mode Verb Provider"
>>>  #define VER_INTERNALNAME_STR		"mthcau.dll"
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/hw/mthca/user/mlnx_uvp_memfree.c
>>> branches\mlx4/hw/mthca/user/mlnx_uvp_memfree.c
>>> --- trunk/hw/mthca/user/mlnx_uvp_memfree.c	2011-09-13 09:16:20.088297500 -0700
>>> +++ branches\mlx4/hw/mthca/user/mlnx_uvp_memfree.c	2011-10-10 16:59:49.368715900 -0700
>>> @@ -52,7 +52,7 @@ struct mthca_db_table {
>>>  };
>>>  
>>>  int mthca_alloc_db(struct mthca_db_table *db_tab, enum
>> mthca_db_type
>>> type,
>>> -		   uint32_t **db)
>>> +		  volatile uint32_t **db)
>>>  { 	int i, j, k; 	int group, start, end, dir; @@ -128,7 +128,7 @@
>>>  found: 		j = MTHCA_DB_REC_PER_PAGE - 1 - j;
>>>  
>>>  	ret = i * MTHCA_DB_REC_PER_PAGE + j;
>>> -	*db = (uint32_t *) &db_tab->page[i].db_rec[j];
>>> +	*db = (volatile uint32_t *) &db_tab->page[i].db_rec[j];
>>> 
>>>  out:
>>>  	ReleaseMutex( db_tab->mutex );
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/hw/mthca/user/SOURCES branches\mlx4/hw/mthca/user/SOURCES
>>> --- trunk/hw/mthca/user/SOURCES	2011-09-13 09:16:19.951283800 -0700
>>> +++ branches\mlx4/hw/mthca/user/SOURCES	2011-10-10 16:59:49.282707300 -0700
>>> @@ -2,15 +2,20 @@ TRUNK=..\..\..
>>> 
>>>  TARGETNAME=mthcau + TARGETPATH=$(TRUNK)\bin\user\obj$(BUILD_ALT_DIR)
>>>  TARGETTYPE=DYNLINK
>>> -
>>> +!if $(_NT_TOOLS_VERSION) == 0x700
>>> +# DDK
>>> +DLLDEF=$O\mlnx_uvp.def
>>> +!else
>>>  # WDK
>>>  DLLDEF=$(OBJ_PATH)\$O\mlnx_uvp.def
>>> -
>>> +!endif
>>>  #USE_NTDLL=1
>>>  USE_MSVCRT=1
>>>  DLLENTRY=DllMain
>>> +NTTARGETFILES=Custom_target
>>> 
>>>  !if $(FREEBUILD) ENABLE_EVENT_TRACING=1 @@ -56,8 +61,13 @@
>>>  TARGETLIBS=\ 	$(SDK_LIB_PATH)\user32.lib \
>>>  	$(SDK_LIB_PATH)\kernel32.lib \
> 	$(SDK_LIB_PATH)\Advapi32.lib \
>>> -	$(TARGETPATH)\*\complib.lib \
>>> -	$(TARGETPATH)\*\ibal.lib
>>> +        $(TARGETPATH)\*\complib.lib \
>>> +        $(TARGETPATH)\*\ibal.lib
>>> +
>>> +
>>> +!if !$(FREEBUILD)
>>> +C_DEFINES=$(C_DEFINES) -D_DEBUG -DDEBUG -DDBG
>>> +!endif
>>> 
>>>  #LINKER_FLAGS=/MAP /MAPINFO:LINES
>>> SH: That's a huge number of changes to a driver that does NOT support
>>> IBoE. Are all of those really needed?
>>> 
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/inc/complib/cl_memory.h
>>> branches\mlx4/inc/complib/cl_memory.h
>>> --- trunk/inc/complib/cl_memory.h	2011-09-13 09:15:47.799068900 -0700
>>> +++ branches\mlx4/inc/complib/cl_memory.h	2011-10-10 16:59:17.406520000 -0700
>>> @@ -189,6 +189,55 @@ __cl_malloc_trk(
>>>  *	Memory Management, __cl_malloc_ntrk, __cl_zalloc_trk, __cl_free_trk
>>>  **********/
>>> +/****i* Component Library: Memory Management/__cl_malloc_trk_ex
>>> +* NAME
>>> +*	__cl_malloc_trk_ex
>>> +*
>>> +* DESCRIPTION
>>> +*	The __cl_malloc_trk_ex function allocates and tracks a block of memory
>>> +*	initialized to zero.
>>> +*
>>> +* SYNOPSIS
>>> +*/
>>> +CL_EXPORT void* CL_API
>>> +__cl_malloc_trk_ex(
>>> +	IN	const char* const	p_file_name,
>>> +	IN	const int32_t 		line_num,
>>> +	IN	const size_t		size,
>>> +	IN	const boolean_t 		pageable,
>>> +	IN	const char*			tag );
>>> +/*
>>> +* PARAMETERS
>>> +*	p_file_name
>>> +*		[in] Name of the source file initiating the allocation.
>>> +*
>>> +*	line_num
>>> +*		[in] Line number in the specified file where the allocation is
>>> +*		initiated
>>> +*
>>> +*	size
>>> +*		[in] Size of the requested allocation.
>>> +*
>>> +*	pageable
>>> +*		[in] On operating systems that support pageable vs. non pageable
>>> +*		memory in the kernel, set to TRUE to allocate memory from paged pool.
>>> +*     tag
>>> +*		[in] An optional ASCII string, describing the name of memory block (or the owner)
>>> +*
>>> +* RETURN VALUES
>>> +*	Pointer to allocated memory if successful.
>>> +*
>>> +*	NULL otherwise.
>>> +*
>>> +* NOTES
>>> +*	Allocated memory follows alignment rules specific to the different
>>> +*	environments.
>>> +*	This function should not be called directly. The cl_zalloc macro will
>>> +*	redirect users to this function when memory tracking is enabled.
>>> +*
>>> +* SEE ALSO
>>> +*	Memory Management, __cl_zalloc_ntrk, __cl_malloc_trk, __cl_free_trk
>>> +**********/
>>> 
>>>  /****i* Component Library: Memory Management/__cl_zalloc_trk * NAME
>>>  @@ -237,6 +286,57 @@ __cl_zalloc_trk( *	Memory Management,
>>>  __cl_zalloc_ntrk, __cl_malloc_trk, __cl_free_trk **********/
>>> +/****i* Component Library: Memory Management/__cl_zalloc_trk_ex
>>> +* NAME
>>> +*	__cl_zalloc_trk_ex
>>> +*
>>> +* DESCRIPTION
>>> +*	The __cl_zalloc_trk_ex function allocates and tracks a block of memory
>>> +*	initialized to zero.
>>> +*
>>> +* SYNOPSIS
>>> +*/
>>> +CL_EXPORT void* CL_API
>>> +__cl_zalloc_trk_ex(
>>> +	IN	const char* const	p_file_name,
>>> +	IN	const int32_t 		line_num,
>>> +	IN	const size_t		size,
>>> +	IN	const boolean_t 		pageable,
>>> +	IN	const char*			tag );
>>> +/*
>>> +* PARAMETERS
>>> +*	p_file_name
>>> +*		[in] Name of the source file initiating the allocation.
>>> +*
>>> +*	line_num
>>> +*		[in] Line number in the specified file where the allocation is
>>> +*		initiated
>>> +*
>>> +*	size
>>> +*		[in] Size of the requested allocation.
>>> +*
>>> +*	pageable
>>> +*		[in] On operating systems that support pageable vs. non pageable
>>> +*		memory in the kernel, set to TRUE to allocate memory from paged pool.
>>> +*     tag
>>> +*		[in] An optional ASCII string, describing the name of memory block (or the owner)
>>> +*
>>> +* RETURN VALUES
>>> +*	Pointer to allocated memory if successful.
>>> +*
>>> +*	NULL otherwise.
>>> +*
>>> +* NOTES
>>> +*	Allocated memory follows alignment rules specific to the different
>>> +*	environments.
>>> +*	This function should not be called directly. The cl_zalloc macro will
>>> +*	redirect users to this function when memory tracking is enabled.
>>> +*
>>> +* SEE ALSO
>>> +*	Memory Management, __cl_zalloc_ntrk, __cl_malloc_trk, __cl_free_trk
>>> +**********/
>>> +
>>> +
>>> 
>>>  /****i* Component Library: Memory Management/__cl_malloc_ntrk * NAME
>>>  @@ -933,6 +1033,18 @@ cl_copy_from_user( #define cl_pzalloc( a ) 	\
>>>  	__cl_zalloc_trk( __FILE__, __LINE__, a, TRUE )
>>> +#define cl_malloc_ex( a, tag )	\
>>> +	__cl_malloc_trk_ex( __FILE__, __LINE__, a, FALSE, tag )
>>> +
>>> +#define cl_zalloc_ex( a, tag )	\
>>> +	__cl_zalloc_trk_ex( __FILE__, __LINE__, a, FALSE, tag )
>>> +
>>> +#define cl_palloc_ex( a, tag )	\
>>> +	__cl_malloc_trk_ex( __FILE__, __LINE__, a, TRUE, tag )
>>> +
>>> +#define cl_pzalloc_ex( a, tag )	\
>>> +	__cl_zalloc_trk_ex( __FILE__, __LINE__, a, TRUE, tag )
>>> +
>>>  #define cl_free( a )	\
>>>  	__cl_free_trk( a )
>>> @@ -949,6 +1061,14 @@ cl_copy_from_user(
>>> 
>>>  #define cl_pzalloc( a )	\
>>>  	__cl_zalloc_ntrk( a, TRUE )
>>> +
>>> +#define cl_malloc_ex( a, tag )	cl_malloc( a )
>>> +
>>> +#define cl_zalloc_ex( a, tag )	cl_zalloc( a )
>>> +
>>> +#define cl_palloc_ex( a, tag )	cl_palloc( a )
>>> +
>>> +#define cl_pzalloc_ex( a, tag )	cl_pzalloc( a )
>>> 
>>>  #define cl_free( a )	\
>>>  	__cl_free_ntrk( a )
>>> SH: Do not extend complib even more.  It needs to go away and be
>>> replaced with native calls (which any Windows developer would
>>> recognize), not added to.
>>> 
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/inc/iba/ib_al_ioctl.h branches\mlx4/inc/iba/ib_al_ioctl.h
>>> --- trunk/inc/iba/ib_al_ioctl.h	2011-09-13 09:15:48.686157600 -0700
>>> +++ branches\mlx4/inc/iba/ib_al_ioctl.h	2011-10-10 16:59:19.579737300
>>> -0700 @@ -3479,7 +3479,7 @@ typedef struct _ual_ndi_req_cm_ioctl_in
>>>  	uint64_t					h_qp;
>>>  	net64_t						guid;
>>>  	net32_t						cid;
>>> -	uint16_t					dst_port;
>>> +	net16_t						dst_port;
>>>  	uint8_t						resp_res;
>>>  	uint8_t						init_depth;
>>>  	uint8_t						prot;
>>> diff -up -r -X \users\mshefty\scm\ofw\gen1\trunk\docs\dontdiff.txt -I
>>> '\$Id' trunk/inc/iba/ib_ci.h branches\mlx4/inc/iba/ib_ci.h ---
>>> trunk/inc/iba/ib_ci.h	2011-09-13 09:15:48.704159400 -0700 +++
>>> branches\mlx4/inc/iba/ib_ci.h	2011-10-10 16:59:19.596739000 -0700 @@
>>> -74,7 +74,9 @@ extern "C"
>>>   * definition.
>>>   */
>>>  #define VERBS_MAJOR_VER			(0x0002)
>>> -#define VERBS_MINOR_VER			(0x0002)
>>> +#define VERBS_MINOR_VER			(0x0004)
>>> +#define VERBS_EX_MAJOR_VER		(0x0001)
>>> +#define VERBS_EX_MINOR_VER		(0x0000)
>>> 
>>>  #define VERBS_VERSION			(((VERBS_MAJOR_VER) << 16) |
>>>  (VERBS_MINOR_VER)) #define MK_VERBS_VERSION(maj,min) 	((((maj) &
>>>  0xFFFF) << 16) | \
>>> SH: Can't this be done without breaking every application?  What is
>>> the plan for supporting existing applications?
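
One possible way to bump the minor version without rejecting existing binaries is to compare the major and minor fields separately instead of testing the packed value for equality. The following is a minimal sketch under that assumption. The helper names (`pack_version`, `version_major`, `version_minor`, `client_is_compatible`) and the acceptance policy are hypothetical and are not taken from ib_ci.h; only the two version constants come from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the quoted patch. */
#define VERBS_MAJOR_VER 0x0002u
#define VERBS_MINOR_VER 0x0004u

/* Hypothetical helper: pack major/minor into one 32-bit value,
 * major in the high 16 bits and minor in the low 16 bits. */
uint32_t pack_version(uint32_t major, uint32_t minor)
{
    assert(major <= 0xFFFFu && minor <= 0xFFFFu);
    return (major << 16) | minor;
}

/* Hypothetical helpers: unpack the two fields again. */
uint32_t version_major(uint32_t v) { return (v >> 16) & 0xFFFFu; }
uint32_t version_minor(uint32_t v) { return v & 0xFFFFu; }

/* Assumed policy (not from the patch): accept a client when the major
 * versions match and the client was built against a minor version that
 * is no newer than the provider's. Returns 1 if accepted, 0 otherwise. */
int client_is_compatible(uint32_t provider, uint32_t client)
{
    if (version_major(provider) != version_major(client))
        return 0;
    return version_minor(client) <= version_minor(provider);
}
```

Under this policy a client built against 2.2 would still be accepted by a 2.4 provider, whereas a strict equality test on the packed value would reject it.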
>>> 
>>> @@ -164,7 +166,7 @@ typedef void
>>>  * NOTES
>>>  *	The consumer only gets the cq_context. It is the client
>>>  *	responsibility to store the cq_handle in the context after the creation
>>> -*	time. So it can call ci_poll_cq() after the arrival of the notification.
>>> +*	time. So it can call ci_poll_cq() or ci_poll_cq_array() after the arrival
>>> of the notification.
>>>  * SEE ALSO *	ci_create_cq ****** @@ -916,7 +918,7 @@ typedef
>>>  ib_api_status_t 	IN		const	ib_pd_handle_t 	h_pd, 	IN		const	void
>>>  	*qp_context, 	IN		const	ci_async_event_cb_t
>>> 	pfn_async_event_cb,
>>> -	IN		const	ib_qp_create_t
>>> 	*p_create_attr,
>>> +	IN	OUT			ib_qp_create_t
>>> 		*p_create_attr,
>>>  		OUT			ib_qp_attr_t 	*p_qp_attr, 	OUT 			ib_qp_handle_t 		*ph_qp,
>>>  	IN	OUT			ci_umv_buf_t
>>> 	*p_umv_buf OPTIONAL );
>>> @@ -934,7 +936,7 @@ typedef ib_api_status_t
>>>  *	pfn_async_event_cb
>>>  *		[in] Asynchronous event handler.
>>>  *	p_create_attr
>>> -*		[in] Initial attributes with which the qp must be created.
>>> +*		[in,out] Initial attributes with which the qp must be created.
>>>  *	p_qp_attr *		[out] Attributes of the newly created queue pair.
>>>  *	ph_qp @@ -961,6 +963,7 @@ typedef ib_api_status_t * 	Unreliable
>>>  datagram not supported *	IB_INVALID_PARAMETER *		The parameter
>>>  p_create_attr is invalid. +*		erroneous value p_create_attr fixed to
>>>  the max possible * NOTES *	If any of the initial parameters is not
>>>  valid, the queue pair is not *	created. If the routine call is not
>>>  successful then the contents of @@ -982,7 +985,7 @@ typedef
>>>  ib_api_status_t 	IN		const	uint8_t 	port_num,
>>>  	IN		const	void				*qp_context,
>>>  	IN		const	ci_async_event_cb_t	pfn_async_event_cb,
>>> -	IN		const	ib_qp_create_t
>>> 	*p_create_attr,
>>> +	IN	OUT			ib_qp_create_t
>>> 	*p_create_attr,
>>>  		OUT			ib_qp_attr_t		*p_qp_attr, 		OUT			ib_qp_handle_t 	*ph_qp );
>>>  /* @@ -1001,8 +1004,9 @@ typedef ib_api_status_t *
>>>  	pfn_async_event_cb *		[in] Asynchronous event handler.
>>>  *	p_create_attr
>>> -*		[in] Initial set of attributes with which the queue pair is to be
>>> -*		created. +*		[in,out] Initial set of attributes with which the
>>> queue pair is to be created. +*			     Upon invalid parameter function
>>> will return IB_INVALID_SETTING +*			     and change the parameter by
>>> max allowable value.
>>>  *	p_qp_attr *		[out] QP attributes after the qp is successfully
>>>  created. * @@ -1230,7 +1234,7 @@ typedef ib_api_status_t * 	pending
>>>  callbacks returning back to the verbs provider driver. * *	If the CQ
>>>  associated with this QP is still not destroyed, the
>>> completions
>>> -*	on behalf of this QP can still be pulled via the ci_poll_cq() call. Any
>>> +*	on behalf of this QP can still be pulled via the ci_poll_cq() or
>>> ci_poll_cq_array() call. Any
>>>  *	resources allocated by the Channel Interface must be deallocated as
>>>  part *	of this call. * SEE ALSO @@ -1289,7 +1293,83 @@ typedef
>>>  ib_api_status_t *		one of the parameters was NULL. * NOTES * 	The
>>>  consumer would need a way to retrieve the cq_handle
>>> associated with -*	context being returned, so it can perform
>>> ci_poll_cq() to retrieve +*	context being returned, so it can perform
>>> ci_poll_cq() or ci_poll_cq_array() to retrieve +*	completion queue
>>> entries. The handle as such is not being passed, since +*	there is no
>>> information in the handle that is visible to the consumer. +*	Passing
>>> a context directly would help avoid any reverse lookup that the
>>> +*	consumer would need to perform in order to identify it's own
>>> internal +*	data-structures	needed to process this completion
>>> completely. +* SEE ALSO +*	ci_destroy_cq, ci_query_cq, ci_resize_cq
>>> +******
>>> +*/
>>> +
>>> +
>>> +/****** struct ci_group_affinity_t
>>> +* mask - Specifies the affinity mask. The bits in the affinity mask identify a set of processors within the group identified by 'group' field.
>>> +* group - Specifies the group number.
>>> +*/
>>> +typedef struct _ib_group_affinity
>>> +{
>>> +	uint64_t 			mask;
>>> +	uint16_t 			group;
>>> +}	ib_group_affinity_t;
>>> +
>>> +/****f* Verbs/ci_create_cq
>>> +* NAME
>>> +*	ci_create_cq_ex -- Create a completion queue (CQ) on the specified HCA with specified affinity.
>>> +* SYNOPSIS
>>> +*/
>>> +
>>> +typedef ib_api_status_t
>>> +(*ci_create_cq_ex) (
>>> +	IN		const	ib_ca_handle_t		h_ca,
>>> +	IN		const	void				*cq_context,
>>> +	IN		const	ci_async_event_cb_t	pfn_async_event_cb,
>>> +	IN				ci_completion_cb_t	completion_cb,
>>> +	IN				ib_group_affinity_t	*affinity,
>>> +	IN	OUT			uint32_t* const		p_size,
>>> +		OUT			ib_cq_handle_t		*ph_cq,
>>> +	IN	OUT			ci_umv_buf_t		*p_umv_buf OPTIONAL );
>>> 
>>> SH: If we need a new call to create CQs, we should just pass in a
>>> structure, so that it can be more easily extended later.
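
The suggestion to pass a structure could look roughly like the sketch below. A size field lets the provider tell which revision of the structure the caller was built against, so fields can be appended later without adding another entry point. Everything here is an assumption for illustration: `cq_create_params_t`, `group_affinity_t`, `CQ_CREATE_PARAMS_V1_SIZE` and `create_cq_sketch` are hypothetical names, not part of ib_ci.h, and the real verb would also carry the handle, callbacks and `p_umv_buf`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the affinity type in the quoted patch. */
typedef struct {
    uint64_t mask;
    uint16_t group;
} group_affinity_t;

/* Hypothetical parameter block. New fields are only ever appended. */
typedef struct {
    size_t           struct_size; /* sizeof() as compiled into the caller */
    uint32_t         size;        /* requested minimum number of CQ entries */
    group_affinity_t affinity;    /* field added in a later revision */
} cq_create_params_t;

/* Size of the structure before 'affinity' was appended. */
#define CQ_CREATE_PARAMS_V1_SIZE offsetof(cq_create_params_t, affinity)

/* Hypothetical provider entry point. Reads only the fields the caller's
 * struct_size covers; missing fields fall back to defaults.
 * Returns 0 on success, -1 on invalid input. */
int create_cq_sketch(const cq_create_params_t *p, group_affinity_t *applied)
{
    assert(applied != NULL);

    if (p == NULL || p->struct_size < CQ_CREATE_PARAMS_V1_SIZE || p->size == 0)
        return -1;

    /* Default for older callers: no affinity requested. */
    applied->mask = 0;
    applied->group = 0;

    /* Newer callers supply the appended field. */
    if (p->struct_size >= sizeof(*p))
        *applied = p->affinity;

    return 0;
}
```

A caller built against the first revision sets `struct_size` to the smaller value and keeps working unchanged when the provider later learns about `affinity`.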
>>> 
>>> +/* +* DESCRIPTION +*	The consumer must specify the minimum number of
>>> entries in the CQ. The +*	exact number of entries the Channel
>>> Interface created is returned to the +*	client. If the requested
>>> number of entries is larger than what this +*	HCA can support, an
>>> error is returned. +* PARAMETERS +*	h_ca +*		[in] A handle to the open
>>> HCA +*	cq_context +*		[in] The context that is passed during the
>>> completion callbacks. +*	pfn_async_event_cb +*		[in] Asynchronous
>>> event handler. +*	completion_cb +*		[in] Callback for completion
>>> events +* 	affinity +*		[in] CQ affinity +*	p_size +*		[in out] Points
>>> to a variable containing the number of CQ entries +*		requested by the
>>> consumer. On completion points to the size of the +*		CQ that was
>>> created by the provider. +*	ph_cq +*		[out] Handle to the newly
>>> created CQ on successful creation. +*	p_umv_buf +*		[in out] Vendor
>>> specific parameter to support user mode IO. +* RETURN VALUE
>>> +*	IB_SUCCESS +* 	The operation was successful.
>>> +*	IB_INVALID_CA_HANDLE +* 		The h_ca passed is invalid.
>>> +*	IB_INSUFFICIENT_RESOURCES +* 	Insufficient resources to complete
>>> request. +*	IB_INVALID_CQ_SIZE +* 	Requested CQ Size is not supported.
>>> +*	IB_INVALID_PARAMETER +*		one of the parameters was NULL. +* NOTES
>>> +*	The consumer would need a way to retrieve the cq_handle associated
>>> with +*	context being returned, so it can perform ci_poll_cq() or
>>> ci_poll_cq_array() to retrieve
>>>  *	completion queue entries. The handle as such is not being passed,
>>>  since *	there is no information in the handle that is visible to the
>>>  consumer. *	Passing a context directly would help avoid any reverse
>>>  lookup that
>>> the
>>> @@ -1353,6 +1433,41 @@ typedef ib_api_status_t
>>>  ******
>>>  */
>>> +typedef ib_api_status_t
>>> +(*ci_modify_cq) (
>>> +	IN		const	ib_cq_handle_t		h_cq,
>>> +	IN				uint16_t			moder_cnt,
>>> +	IN				uint16_t			moder_time,
>>> +	IN	OUT			ci_umv_buf_t		*p_umv_buf OPTIONAL );
>>> 
>>> SH: This needs a different name than 'modify CQ', but I really
>>> question whether this functionality is something that should be exposed
>>> above verbs. This seems more like a feature of how the CQ is armed,
>>> versus its existence.
>>> 
>>> +/*
>>> +* DESCRIPTION
>>> +*	This routine allows the caller to modify CQ interrupt moderation.
>>> +* PARAMETERS
>>> +*	h_cq
>>> +*		[in] Completion Queue handle
>>> +*	moder_cnt
>>> +*		[in] This parameter indicates the requested interrupt moderation count.
>>> +*	moder_time
>>> +*		[in] This parameter indicates the requested time interval between indicated interrupts.
>>> +*	p_umv_buf
>>> +*		[in out] Vendor specific parameter to support user mode IO.
>>> +* RETURN VALUE
>>> +*	IB_SUCCESS
>>> +*		The modify operation was successful.
>>> +*	IB_INVALID_CQ_HANDLE
>>> +*		The CQ handle is invalid.
>>> +*	IB_INSUFFICIENT_RESOURCES
>>> +*		Insufficient resources to complete request.
>>> +*	IB_INVALID_PARAMETER
>>> +*		one of the parameters was NULL.
>>> +*
>>> +* NOTES
>>> +*
>>> +* SEE ALSO
>>> +*	ci_create_cq
>>> +******
>>> +*/
>>> +
>>>  /****f* Verbs/ci_query_cq * NAME *	ci_query_cq -- Query the number of
>>>  entries configured for the CQ. @@ -2273,7 +2388,7 @@ typedef
>>>  ib_api_status_t *	on different types of queue pairs, and the
>>>  different modifiers *	acceptable for the work request for different
>>>  QP service types. * SEE ALSO
>>> -*	ci_post_recv, ci_poll_cq
>>> +*	ci_post_recv, ci_poll_cq, ci_poll_cq_array
>>>  ****** */ @@ -2365,7 +2480,7 @@ typedef ib_api_status_t * 	QP was in
>>>  reset or init state. *		(TBD: there may be an errata that allows
>>>  posting in init state) * SEE ALSO
>>> -*	ci_post_send, ci_poll_cq.
>>> +*	ci_post_send, ci_poll_cq, ci_poll_cq_array
>>>  ****** */ @@ -2406,7 +2521,7 @@ typedef ib_api_status_t *	is optional
>>>  by a channel adapter vendor. * * SEE ALSO
>>> -*	ci_create_cq, ci_poll_cq, ci_enable_cq_notify,
>>> ci_enable_ncomp_cq_notify +*	ci_create_cq, ci_poll_cq,
>>> ci_poll_cq_array, ci_enable_cq_notify, ci_enable_ncomp_cq_notify
>>>  *****/
>>>  
>>>  /****f* Verbs/ci_poll_cq @@ -2450,6 +2565,57 @@ typedef
>>>  ib_api_status_t ****** */
>>> +
>>> +/****f* Verbs/ci_poll_cq_array
>>> +* NAME
>>> +*	ci_poll_cq_array -- Retrieve a work completion record from a completion queue
>>> +* SYNOPSIS
>>> +*/
>>> +
>>> +typedef ib_api_status_t
>>> +(*ci_poll_cq_array) (
>>> +	IN		const	ib_cq_handle_t		h_cq,
>>> +	IN	OUT			int*				p_num_entries,
>>> +		OUT			ib_wc_t*	const	wc );
>>> +/*
>>> +* DESCRIPTION
>>> +*	This routine retrieves a work completion entry from the specified
>>> +*	completion queue. The contents of the data returned in a work completion
>>> +*	is specified in ib_wc_t.
>>> +*
>>> +* PARAMETERS
>>> +*	h_cq
>>> +*		[in] Handle to the completion queue being polled.
>>> +*	p_num_entries
>>> +*		[in out] Pointer to a variable, containing the number of entries in the array.
>>> +*		On succeful return it will contain the number of filled entries of the array.
>>> +*	wc
>>> +*		[out] An array of workcompletions retrieved from the completion queue
>>> +*		and successfully processed.
>>> +*
>>> +* RETURN VALUE
>>> +*	IB_SUCCESS
>>> +*		Poll completed successfully and found N>0 entries.
>>> +*		The wc array then contains N entries filled and *p_num_entries is equal to N.
>>> +*	IB_INVALID_CQ_HANDLE
>>> +*		The cq_handle supplied is not valid.
>>> +*	IB_NOT_FOUND
>>> +*		There were no completion entries found in the specified CQ.
>>> +*
>>> +* NOTES
>>> +*	This function returns qp_context in the first field of WC structure.
>>> +*	This first field (p_next) is intended to link WCs in a list and is not supposed
>>> +*	to be used in an array of WCs.
>>> +*	qp_context is a value, defined by user upon create_qp.
>>> +*	This function is intended for use with SRQ when the new qp_context
>>> +*	returned value will to the QP, related to the completion.
>>> +*
>>> +* SEE ALSO
>>> +*	ci_create_cq, ci_post_send, ci_post_recv, ci_bind_mw
>>> +******
>>> +*/
>>> +
>>> +
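
The in/out contract described for `p_num_entries` (array capacity on input, number of filled entries on output, IB_NOT_FOUND when the CQ is empty) can be illustrated with a small mock. This is only a sketch of the calling convention as the comment block describes it. All `sk_` names are hypothetical stand-ins rather than the real verbs types, and setting the count to 0 on the not-found path is an assumption, since the patch does not say what the variable holds in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for IB_SUCCESS / IB_NOT_FOUND and ib_wc_t. */
typedef enum { SK_SUCCESS, SK_NOT_FOUND } sk_status_t;
typedef struct { int wr_id; } sk_wc_t;

/* Mock completion queue: a read-only list of pending completions. */
typedef struct {
    const sk_wc_t *pending;
    int            count;
    int            head;
} sk_cq_t;

/* Mock of the array poll: *p_num_entries is the array capacity on input
 * and the number of filled entries on output. */
sk_status_t sk_poll_cq_array(sk_cq_t *cq, int *p_num_entries, sk_wc_t *wc)
{
    assert(cq != NULL && p_num_entries != NULL && wc != NULL);

    int capacity = *p_num_entries;
    int filled = 0;

    while (filled < capacity && cq->head < cq->count)
        wc[filled++] = cq->pending[cq->head++];

    *p_num_entries = filled; /* assumed to be 0 when nothing was found */
    return filled > 0 ? SK_SUCCESS : SK_NOT_FOUND;
}

/* Caller-side loop: reset the capacity before every call, because the
 * previous call overwrote it with the filled count. Returns the total
 * number of completions drained. */
int sk_drain(sk_cq_t *cq, sk_wc_t *scratch, int capacity)
{
    int total = 0;

    for (;;) {
        int n = capacity;
        if (sk_poll_cq_array(cq, &n, scratch) != SK_SUCCESS)
            break;
        total += n;
    }
    return total;
}
```

The point worth noting for callers is the reset of `n` inside the loop: reusing the variable without restoring the capacity would shrink the usable array on every iteration.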
>>>  /****f* Verbs/ci_enable_cq_notify
>>>  * NAME
>>>  *	ci_enable_cq_notify -- Invoke the Completion handler, on next entry
>>> added.
>>> @@ -2482,12 +2648,12 @@ typedef ib_api_status_t
>>>  *	The consumer cannot call a request for notification without emptying
>>>  *	entries from the CQ. i.e if a consumer registers for a notification
>>>  *	request in the completion callback before pulling entries from the
>>> -*	CQ via ci_poll_cq, the notification is not generated for completions
>>> +*	CQ via ci_poll_cq or ci_poll_cq_array, the notification is not
>>> generated for completions
>>>  *	already in the CQ. For e.g. in the example below, if there are no calls
>>> -*   to ci_poll_cq()	after the ci_enable_cq_notify(). For any CQ entries
>>> added +*   to ci_poll_cq() or ci_poll_cq_array() after the
>>> ci_enable_cq_notify(). For any CQ entries added
>>>  *	before calling this ci_enable_cq_notify() call, the consumer does not
>>>  *	get a completion notification callback. In order to comply with the
>>> verb
>>> -*	spec, consumer is supposed to perform a ci_poll_cq() after the
>>> +*	spec, consumer is supposed to perform a ci_poll_cq() or
>>> ci_poll_cq_array() after the
>>>  *	ci_enable_cq_notify() is made to retrive any entries that might have
>>>  *	been added to the CQ before the CI registers the notification enable.
>>>  *
>>> @@ -2548,7 +2714,7 @@ typedef ib_api_status_t
>>>  *	vendor.
>>>  *
>>>  * SEE ALSO
>>> -*	ci_create_cq, ci_peek_cq, ci_poll_cq, ci_enable_cq_notify
>>> +*	ci_create_cq, ci_peek_cq, ci_poll_cq, ci_poll_cq_array, ci_enable_cq_notify
>>>  ****** */ @@ -2774,6 +2940,116 @@ typedef ib_api_status_t
>>>  *	ci_register_smr, ci_create_mw, ib_ci_op_t *****/
>>> + +/****f* Verbs/ci_alloc_fast_reg_mr +* NAME +*
>>> 	ci_alloc_fast_reg_mr