[openib-general] Question on the best approach to debug an infiniband connection problem

Sean Hubbell shubbell at dbresearch.net
Wed Aug 24 06:24:28 PDT 2005


Hello,

  I was wondering if there is a "best practices" method for debugging a
possible InfiniBand connection problem. I am currently trying to send a
message over the InfiniBand ib0 interface and I keep getting transmit
errors. Apart from checking that the port state is active, I am at a
loss as to how to find the problem. I did notice a lot of errors in
/var/log/osm.log; today's entries are listed below:


Aug 24 08:19:10 [42FFF960] -> osm_report_notice: Reporting Generic 
Notice type:3 num:67 from LID:0x0001 
GID:0xfe80000000000000,0x0005ad000003d269
Aug 24 08:19:10 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:19:10 [42FFF960] -> osm_mcmr_rcv_join_mgrp: ERR 1B11: method = 
SubnAdmSet,scope_state = 0x1, component mask = 0x0000000000010083, 
expected comp mask = 0x00000000000130c7.
Aug 24 08:19:10 [42FFF960] -> osm_vendor_send: RMPP 0 length 256
Aug 24 08:19:14 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:19:14 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:19:14 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:19:14 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:19:14 [42FFF960] -> osm_report_notice: Reporting Generic 
Notice type:3 num:67 from LID:0x0001 
GID:0xfe80000000000000,0x0005ad000003d269
Aug 24 08:19:14 [42FFF960] -> osm_report_notice: Reporting Generic 
Notice type:3 num:67 from LID:0x0001 
GID:0xfe80000000000000,0x0005ad000003d269
Aug 24 08:19:16 [447FF960] -> umad_receiver: recv error Interrupted 
system call
Aug 24 08:22:05 [AB441140] -> OpenSM Rev:openib-1.0.0
Aug 24 08:22:05 [AB441140] -> osm_opensm_init: Forcing single threaded 
dispatcher.
Aug 24 08:22:05 [AB441140] -> osm_report_notice: Reporting Generic 
Notice type:3 num:66 from LID:0x0000 
GID:0xfe80000000000000,0x0000000000000000
Aug 24 08:22:05 [AB441140] -> osm_report_notice: Reporting Generic 
Notice type:3 num:66 from LID:0x0000 
GID:0xfe80000000000000,0x0000000000000000
Aug 24 08:22:05 [AB441140] -> osm_vendor_get_all_port_attr: assign CA 
mthca0 port 1 guid (0x5ad000003d269) as the default port.
Aug 24 08:22:05 [AB441140] -> osm_vendor_bind: Binding to port 
0x5ad000003d269.
Aug 24 08:22:05 [AB441140] -> osm_vendor_bind: Binding to port 
0x5ad000003d269.
Aug 24 08:22:05 [42FFF960] -> __osm_trap_rcv_process_request: Received 
Generic Notice type:0x01 num:128 Producer:2 from LID:0x0002 
TID:0x0000000000000000
Aug 24 08:22:05 [42FFF960] -> osm_report_notice: Reporting Generic 
Notice type:1 num:128 from LID:0x0002 
GID:0xfe80000000000000,0x0002c9010bec5320
Aug 24 08:22:06 [42FFF960] -> __osm_trap_rcv_process_request: Received 
Generic Notice type:0x04 num:144 Producer:1 from LID:0x0001 
TID:0x0000000000000000
Aug 24 08:22:06 [42FFF960] -> osm_report_notice: Reporting Generic 
Notice type:4 num:144 from LID:0x0001 
GID:0xfe80000000000000,0x0005ad000003d269
Aug 24 08:22:12 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:22:12 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
Aug 24 08:22:12 [42FFF960] -> osm_vendor_send: RMPP 0 length 112
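
As a rough aid for scanning a log like the one above, here is a minimal
sketch in Python that rejoins wrapped entries and picks out the ones
tagged "ERR xxxx". It is not part of OpenSM. The entry layout (timestamp,
bracketed thread id, "->", function name, message) and the four-hex-digit
error code are assumptions drawn only from this excerpt, and the names
join_wrapped and find_errors are made up for illustration.

```python
import re

# Assumed entry header, based only on the excerpt above:
#   "Mon DD HH:MM:SS [THREADID] -> rest of entry"
HEADER = re.compile(
    r"^([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) \[([0-9A-F]+)\] -> (.*)$"
)
# Assumed error tag: "ERR" followed by four hex digits, as in "ERR 1B11".
ERR_CODE = re.compile(r"\bERR ([0-9A-F]{4})\b")


def join_wrapped(lines):
    """Rejoin entries that were wrapped onto several lines."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if HEADER.match(line) or not entries:
            entries.append(line)              # start of a new entry
        else:
            entries[-1] += " " + line         # continuation of the previous one
    return entries


def find_errors(entries):
    """Return (timestamp, function, code, message) for each ERR entry."""
    errors = []
    for entry in entries:
        header = HEADER.match(entry)
        if not header:
            continue
        timestamp, _thread, rest = header.groups()
        code = ERR_CODE.search(rest)
        if code:
            function, _, message = rest.partition(": ")
            errors.append((timestamp, function, code.group(1), message))
    return errors


# A few lines copied from the log above, wrapped as they appear in this mail.
SAMPLE = [
    "Aug 24 08:19:10 [42FFF960] -> osm_vendor_send: RMPP 0 length 112",
    "Aug 24 08:19:10 [42FFF960] -> osm_mcmr_rcv_join_mgrp: ERR 1B11: method = ",
    "SubnAdmSet,scope_state = 0x1, component mask = 0x0000000000010083, ",
    "expected comp mask = 0x00000000000130c7.",
    "Aug 24 08:19:10 [42FFF960] -> osm_vendor_send: RMPP 0 length 256",
]

for timestamp, function, code, message in find_errors(join_wrapped(SAMPLE)):
    print(timestamp, function, code, message)
```

Lower-case messages such as the umad_receiver "recv error" line are
deliberately not matched; only entries carrying an "ERR" code are reported.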


Thanks for any and all guidance in advance,

Sean



