[openib-general] uAT issues after SM node bounced

Arlin Davis ardavis at ichips.intel.com
Fri Aug 19 13:48:56 PDT 2005


Hal,

Sean and I are seeing some issues with uAT when our dedicated SM node 
bounces. Sean saw a kernel oops (I will let him send the output), and I 
see the following console message when my ib_at_ips_by_gid requests fail:

 ib_at: ib_dev_ats_op: dev (ffffffff880775c0) ib0 already has pending op 21

Here are my 5 retries from the ib_at_ips_by_gid() call:

 open_hca: GID subnet fe80000000000000 id 0002c9020000409d
 get_hca_addr: ips_by_gid ret 0 at_rec 0x7fffffff9630 -> id 31788
 ip_comp_handler: at_rec 0x7fffffff9630 ->id 31788 id 31788 rec_num -22 30803000
 ip_comp_handler: resolution err -22 retry 1
 ip_comp_handler: NEW ips_by_gid ret 0 at_rec 0x7fffffff9630 -> id 31789
 at_thread: callback woke
 ip_comp_handler: at_rec 0x7fffffff9630 ->id 31789 id 31789 rec_num -22 0
 ip_comp_handler: resolution err -22 retry 2
 ip_comp_handler: NEW ips_by_gid ret 0 at_rec 0x7fffffff9630 -> id 31790
 at_thread: callback woke
 ip_comp_handler: at_rec 0x7fffffff9630 ->id 31790 id 31790 rec_num -22 0
 ip_comp_handler: resolution err -22 retry 3
 ip_comp_handler: NEW ips_by_gid ret 0 at_rec 0x7fffffff9630 -> id 31791
 at_thread: callback woke
 ip_comp_handler: at_rec 0x7fffffff9630 ->id 31791 id 31791 rec_num -22 0
 ip_comp_handler: resolution err -22 retry 4
 ip_comp_handler: ERR: at_rec  0x7fffffff9630, req_id 31791 rec_num -22
 at_thread: callback woke
 open_hca: IB get ADDR failed for mthca0
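
For anyone reading the trace above, here is a rough sketch of the retry pattern it shows. This is not the actual uAT or uDAPL code. resolve_once(), resolve_with_retries() and MAX_RETRIES are hypothetical names made up for illustration, and the stub simply fails every time with -EINVAL (-22), as each request in the log did.

```c
#include <errno.h>
#include <stdio.h>

#define MAX_RETRIES 4 /* assumption: the log prints "retry 1" .. "retry 4" */

/* Hypothetical stand-in for one resolution request.
 * Always fails with -EINVAL (-22), like every attempt in the log. */
static int resolve_once(unsigned int req_id)
{
    (void)req_id;
    return -EINVAL;
}

/* Issue a request and re-issue it with the next id while it keeps failing.
 * Gives up once the retry counter reaches MAX_RETRIES, without sending
 * another request. Returns the last rec_num and stores the last id used. */
static int resolve_with_retries(unsigned int first_id, unsigned int *last_id)
{
    unsigned int id = first_id;
    int retry = 0;
    int rec_num;

    for (;;) {
        rec_num = resolve_once(id);
        if (rec_num >= 0)
            break; /* resolved */

        retry++;
        printf("resolution err %d retry %d\n", rec_num, retry);
        if (retry >= MAX_RETRIES)
            break; /* out of retries, report the error */

        id++; /* each new request gets the next id */
    }

    *last_id = id;
    return rec_num;
}
```

Starting from id 31788, this sketch sends four requests (31788 through 31791) and returns -22 with the last id set to 31791, which lines up with the final ERR line in the trace.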

Sometimes bouncing the IPoIB device (ifconfig down/up) gets it working 
again; other times it does not.

-arlin



