[openib-general] Unusable QP's on CM established connections from gen2 client to gen1 server.

Sean Hefty mshefty at ichips.intel.com
Fri Nov 10 12:05:58 PST 2006


> I tried to increase the retry_count but without success. In fact the
> maximum value I could set was 7.

Retry_count is a 3 bit value, so 7 is fine.

> The only thing that worries me is the timeout of 1 for the gen2 stack
> which is 12 for the gen1 stack.
> Is there a way to increase this? 

The gen2 stack calculates this value based on the packet lifetime (+1) in the 
path record.  The timeout is encoded as an exponent (4.096 usec * 2^value), so a 
value of 1 is about 8 microseconds.  How did you obtain the path record that you 
passed into ib_send_cm_req()?  It looks like the packet lifetime being passed in 
is 0.
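
To put numbers on the two timeout values, here is a minimal sketch of the encoding.  It assumes the usual IB convention of 4.096 usec * 2^value for a 5-bit field, plus the "packet lifetime + 1" rule mentioned above.  The helper names are made up for illustration and are not functions from either stack.

```c
#include <stdint.h>

/*
 * Decode a 5-bit encoded IB timeout into nanoseconds.
 * Encoding assumed here: 4.096 usec * 2^value, i.e. 4096 ns << value.
 */
static uint64_t ib_timeout_ns(unsigned int encoded)
{
    return (uint64_t)4096 << (encoded & 0x1f);
}

/*
 * Hypothetical helper: derive the timeout from the path record's
 * packet lifetime as "lifetime + 1", clamped to the 5-bit maximum.
 */
static unsigned int timeout_from_packet_life(unsigned int packet_life_time)
{
    unsigned int t = packet_life_time + 1;

    return t > 31 ? 31 : t;
}
```

With these helpers, ib_timeout_ns(1) is 8192 ns (about 8 usec) and ib_timeout_ns(12) is 16777216 ns (about 16.8 msec), so the two sides differ by a factor of 2048.  A packet lifetime of 0 yields a timeout of 1.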

I have no idea how gen1 gets this value, but it should be pulling it from the CM 
REQ message.  One side ending up with 12 while the other has 1 is a mismatch 
that we probably need to understand.  Is the timeout still 1 when connecting 
gen2 to gen2?

> Question about psn:
> In my gen1 application I have no place where I explicitly set the psn's. The
> psn's are either set by qp creation or the cm kernel code (I don't know
> and don't care, it works!)
> In my gen2 code I copied an example where the
> ib_cm_req_param.starting_psn is explicitly set to the qp_num and the
> qp_attr.rq_psn is set to qp_num in the transition from Rts to Rtr.
> Without that last setting in Rts to Rtr even gen2 to gen2 does not
> work.
> Is there something that I'm missing?

Based on your output, your psn values look fine.

>                           ah_attr.is_global: 1

You're not actually trying to go between subnets, are you?  I think is_global 
should be 0 here.  The gen2 stack sets this based on information from the path 
record.  What value is in the path record hop_limit?  (It may help if you just 
print out the path record passed into ib_send_cm_req().)
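
Something along these lines could be used to dump the interesting fields.  This is only a sketch: struct path_rec_view is a hypothetical stand-in for the few path record fields discussed here, not the real path record structure, and the "hop_limit > 1 means global" rule is an assumption about how the stack decides, so check it against the source you are running.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical stand-in for the path record fields of interest. */
struct path_rec_view {
    uint16_t dlid;
    uint16_t slid;
    uint8_t  hop_limit;
    uint8_t  packet_life_time;
};

/* Assumed rule: a GRH (is_global = 1) is only used when hop_limit > 1. */
static int path_needs_grh(const struct path_rec_view *p)
{
    return p->hop_limit > 1;
}

/* Format the fields into buf; returns what snprintf() returns. */
static int format_path_rec(char *buf, size_t len,
                           const struct path_rec_view *p)
{
    return snprintf(buf, len,
                    "dlid=0x%04x slid=0x%04x hop_limit=%u "
                    "packet_life_time=%u is_global=%d",
                    (unsigned int)p->dlid, (unsigned int)p->slid,
                    (unsigned int)p->hop_limit,
                    (unsigned int)p->packet_life_time,
                    path_needs_grh(p));
}
```

In kernel code the same fields would be printed with printk() right before the ib_send_cm_req() call.  For a connection within a single subnet I would expect is_global=0 in the output.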

>                            ah_attr.port_num: 1
>                               min_rnr_timer: 4
>                                    port_num: 0

What is "port_num: 0" above?  ah_attr.port_num looks correct.  Also, for the 
gen1 side, you displayed "port: 1" and "av.port: 0".  I would expect the port 
number to be >= 1.

- Sean



