[openib-general] Kernel assertion

Woodruff, Robert J robert.j.woodruff at intel.com
Fri Dec 17 12:00:00 PST 2004


I was running an MPI application over IPoIB on a 4-node cluster when one
of the systems died with the following messages logged to
/var/log/messages. The svn rev. is 1355. Two of the nodes have PCI-E cards
and two have PCI-X cards. The system that asserted was one of the PCI-E
systems.

Dec 17 11:45:33 iclust-16 rsh(pam_unix)[4605]: session closed for user woody
Dec 17 11:46:30 iclust-16 kernel: ib_mthca 0000:04:00.0: SQ full (64 posted, 64 max, 0 nreq)
Dec 17 11:46:30 iclust-16 kernel: ib0: post_send failed
Dec 17 11:46:39 iclust-16 kernel: ib_mthca 0000:04:00.0: SQ full (64 posted, 64 max, 0 nreq)
Dec 17 11:46:39 iclust-16 kernel: ib0: post_send failed
Dec 17 11:46:39 iclust-16 kernel: KERNEL: assertion (!atomic_read(&skb->users)) failed at net/core/dev.c (1616)
Dec 17 11:49:49 iclust-16 syslogd 1.4.1: restart.
 
Any ideas?

woody


