[openib-general] Re: SDP: still getting sk_alloc() panic, any ideas?

Libor Michalek libor at topspin.com
Mon Jun 27 11:17:21 PDT 2005


On Thu, Jun 23, 2005 at 01:06:47PM -0700, Tom Duffy wrote:
> I am still getting the panic when you try to connect to a machine and
> it is not listening (but has ib_sdp loaded):
> 
> [root at sins-stinger-10 ~]# ----------- [cut here ] --------- [please
> bite here ] ---------
> Kernel BUG at "/build1/tduffy/openib-work/linux-2.6.12-openib/in:352
> invalid operand: 0000 [1] SMP
> CPU 1
> 
> Any idea about this?

  Sorry, I've been out of the office for a few days. 

  The problem is that each call to sk_alloc() grabs a reference to the
module, but first checks that the module already has at least one
reference; if it does not, the BUG above is triggered. In the case of
the passive connection there are no other references to the module. You
can see that the problem goes away if you open just one socket, even if
you don't listen on it, and then try the failing passive connect. When a
socket is created it actually grabs two references to the module, one at
the sock level and one at the sk level. The first reference, at the sock
level, does not trigger the BUG since it goes through another code path
(try_module_get() vs. __module_get()). This is why we only hit this
during a passive connect to a system that has no open SDP sockets.
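
  To make the two code paths concrete, here is a rough userspace model of
the reference counting described above. It is only a sketch, not kernel
code: struct fake_module and every model_* function are hypothetical
stand-ins, and the "would BUG" case is reported as a false return value
instead of an actual BUG. It assumes that try_module_get() may take the
first reference on a live module, while __module_get() requires the
caller to already hold one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a module's reference count. */
struct fake_module {
    int  refcount;
    bool live;      /* false once the module is being unloaded */
};

/* Models try_module_get(): allowed to take the very first reference. */
static bool model_try_module_get(struct fake_module *m)
{
    if (!m->live)
        return false;
    m->refcount++;
    return true;
}

/*
 * Models __module_get(): the caller must already hold a reference.
 * Returns false at the point where the kernel would hit the BUG.
 */
static bool model_module_get_nocheck(struct fake_module *m)
{
    if (m->refcount == 0)
        return false;
    m->refcount++;
    return true;
}

/* Socket creation: one reference at the sock level, one at the sk level. */
static bool model_socket_create(struct fake_module *m)
{
    bool ok;

    if (!model_try_module_get(m))       /* sock level */
        return false;
    ok = model_module_get_nocheck(m);   /* sk level, via sk_alloc() */
    assert(ok);  /* cannot fail: the sock-level reference is already held */
    (void)ok;
    return true;
}

/* Passive connect: only the sk-level reference, taken in sk_alloc(). */
static bool model_passive_connect(struct fake_module *m)
{
    return model_module_get_nocheck(m);
}
```

  With a fresh module (refcount 0), model_passive_connect() fails right
away, which corresponds to the panic. After one model_socket_create() the
count is 2, and a following model_passive_connect() succeeds.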

  I'm not sure of the right way to fix this. Maybe check that the socket
table size (dev_root_s.sk_entry) is greater than 0 in sdp_cm_req_handler()
before even performing the alloc...

-Libor



