[ofa-general] page allocation failure
Bernd Schubert
bs at q-leap.de
Thu Feb 28 09:42:19 PST 2008
Hello,
on several of our Lustre servers we are seeing page allocation failures.
This is with kernel 2.6.22 plus the kernel modules from OFED 1.2.5.
[44464.764559] Lustre: 24052:0:(ldlm_lib.c:698:target_handle_connect()) Skipped 16 previous similar messages
[54132.351263] ib_cm/2: page allocation failure. order:0, mode:0x10d0
[54132.360738]
[54132.360741] Call Trace:
[54132.367803] [<ffffffff8020ac61>] show_trace+0x34/0x47
[54132.373235] [<ffffffff8020ac86>] dump_stack+0x12/0x17
[54132.378937] [<ffffffff80251bc4>] __alloc_pages+0x2a3/0x2bc
[54132.386180] [<ffffffff8020f75c>] dma_alloc_pages+0x9b/0xbf
[54132.395120] [<ffffffff8020f7f6>] dma_alloc_coherent+0x76/0x1cc
[54132.401651] [<ffffffff8809af1e>] :ib_mthca:mthca_buf_alloc+0x1bd/0x2a3
[54132.408897] [<ffffffff8809f9a9>] :ib_mthca:mthca_alloc_qp_common+0x246/0x4e5
[54132.418884] [<ffffffff880a0c6d>] :ib_mthca:mthca_alloc_qp+0xab/0x102
[54132.425774] [<ffffffff880a5217>] :ib_mthca:mthca_create_qp+0x126/0x281
[54132.432716] [<ffffffff88054bc5>] :ib_core:ib_create_qp+0x17/0x91
[54132.439102] [<ffffffff88161c9f>] :rdma_cm:rdma_create_qp+0x2d/0x153
[54132.446301] [<ffffffff8835d0cc>] :ko2iblnd:kiblnd_create_conn+0x81c/0x1250
[54132.456992] [<ffffffff88365295>] :ko2iblnd:kiblnd_passive_connect+0x605/0xdd0
[54132.469847] [<ffffffff88366975>] :ko2iblnd:kiblnd_cm_callback+0x255/0xeb0
[54132.478821] [<ffffffff881620e7>] :rdma_cm:cma_req_handler+0x322/0x389
[54132.485637] [<ffffffff88155fa4>] :ib_cm:cm_process_work+0x17/0xad
[54132.492182] [<ffffffff88157025>] :ib_cm:cm_req_handler+0x7ae/0x81b
[54132.499236] [<ffffffff881570bf>] :ib_cm:cm_work_handler+0x2d/0xbaa
[54132.506690] [<ffffffff80236291>] run_workqueue+0x7f/0x10b
[54132.512652] [<ffffffff80236b1a>] worker_thread+0xda/0xe4
[54132.520136] [<ffffffff8023959a>] kthread+0x47/0x75
[54132.525570] [<ffffffff8020a2f8>] child_rip+0xa/0x12
[54132.532975]
[54132.535527] Mem-info:
[54132.538157] Node 0 DMA per-cpu:
[54132.542303] CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.551752] CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.561661] CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.571154] CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.580597] CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.592354] CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.601794] CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.610719] CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54132.619630] Node 0 DMA32 per-cpu:
[54132.623551] CPU 0: Hot: hi: 186, btch: 31 usd: 49 Cold: hi: 62, btch: 15 usd: 49
[54132.632691] CPU 1: Hot: hi: 186, btch: 31 usd: 26 Cold: hi: 62, btch: 15 usd: 3
[54132.642680] CPU 2: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 54
[54132.651897] CPU 3: Hot: hi: 186, btch: 31 usd: 1 Cold: hi: 62, btch: 15 usd: 13
[54132.663321] CPU 4: Hot: hi: 186, btch: 31 usd: 43 Cold: hi: 62, btch: 15 usd: 55
[54132.673282] CPU 5: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 49
[54132.683636] CPU 6: Hot: hi: 186, btch: 31 usd: 25 Cold: hi: 62, btch: 15 usd: 1
[54132.693156] CPU 7: Hot: hi: 186, btch: 31 usd: 13 Cold: hi: 62, btch: 15 usd: 56
[54132.703412] Node 0 Normal per-cpu:
[54132.707024] CPU 0: Hot: hi: 186, btch: 31 usd: 130 Cold: hi: 62, btch: 15 usd: 14
[54132.719317] CPU 1: Hot: hi: 186, btch: 31 usd: 81 Cold: hi: 62, btch: 15 usd: 1
[54132.729276] CPU 2: Hot: hi: 186, btch: 31 usd: 134 Cold: hi: 62, btch: 15 usd: 2
[54132.738819] CPU 3: Hot: hi: 186, btch: 31 usd: 124 Cold: hi: 62, btch: 15 usd: 8
[54132.748078] CPU 4: Hot: hi: 186, btch: 31 usd: 21 Cold: hi: 62, btch: 15 usd: 4
[54132.758029] CPU 5: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 9
[54132.766855] CPU 6: Hot: hi: 186, btch: 31 usd: 120 Cold: hi: 62, btch: 15 usd: 13
[54132.776462] CPU 7: Hot: hi: 186, btch: 31 usd: 166 Cold: hi: 62, btch: 15 usd: 12
[54132.786009] Active:28507 inactive:62701 dirty:8386 writeback:27 unstable:0
[54132.786010] free:5586 slab:273528 mapped:2136 pagetables:699 bounce:0
[54132.803082] Node 0 DMA free:11192kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:10660kB pages_scanned:0 all_unreclaimable? yes
[54132.816507] lowmem_reserve[]: 0 3255 4013
[54132.820811] Node 0 DMA32 free:9812kB min:6564kB low:8204kB high:9844kB active:52536kB inactive:134508kB present:3333728kB pages_scanned:0 all_unreclaimable? no
[54132.839252] lowmem_reserve[]: 0 0 757
[54132.843205] Node 0 Normal free:1340kB min:1524kB low:1904kB high:2284kB active:61492kB inactive:116296kB present:775680kB pages_scanned:800 all_unreclaimable? no
[54132.859932] lowmem_reserve[]: 0 0 0
[54132.863784] Node 0 DMA: 6*4kB 4*8kB 4*16kB 4*32kB 3*64kB 0*128kB 2*256kB 0*512kB 2*1024kB 0*2048kB 2*4096kB = 11192kB
[54132.876957] Node 0 DMA32: 48*4kB 33*8kB 26*16kB 3*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB = 9608kB
[54132.891138] Node 0 Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1456kB
[54132.903195] Swap cache: add 0, delete 0, find 0/0, race 0+0
[54132.909967] Free swap = 4200888kB
[54132.913677] Total swap = 4200888kB
[54132.917229] Free swap: 4200888kB
[54132.967201] 1245184 pages of RAM
[54132.971121] 231685 reserved pages
[54132.974973] 58033 pages shared
[54132.978329] 0 pages swap cached
[54132.982267] LustreError: 4103:0:(o2iblnd.c:791:kiblnd_create_conn()) Can't create QP: -12
[54177.640441] ib_cm/5: page allocation failure. order:0, mode:0x10d0
[54177.648631]
[54177.648632] Call Trace:
[54177.653908] [<ffffffff8020ac61>] show_trace+0x34/0x47
[54177.660073] [<ffffffff8020ac86>] dump_stack+0x12/0x17
[54177.667176] [<ffffffff80251bc4>] __alloc_pages+0x2a3/0x2bc
[54177.682952] [<ffffffff8020f75c>] dma_alloc_pages+0x9b/0xbf
[54177.688811] [<ffffffff8020f7f6>] dma_alloc_coherent+0x76/0x1cc
[54177.695277] [<ffffffff8809af1e>] :ib_mthca:mthca_buf_alloc+0x1bd/0x2a3
[54177.702683] [<ffffffff8809c85f>] :ib_mthca:mthca_alloc_cq_buf+0x38/0x86
[54177.711034] [<ffffffff8809d7f6>] :ib_mthca:mthca_init_cq+0x12a/0x397
[54177.718478] [<ffffffff880a5462>] :ib_mthca:mthca_create_cq+0xf0/0x1be
[54177.725601] [<ffffffff88054c66>] :ib_core:ib_create_cq+0x27/0x56
[54177.732384] [<ffffffff8835cc60>] :ko2iblnd:kiblnd_create_conn+0x3b0/0x1250
[54177.739683] [<ffffffff88365295>] :ko2iblnd:kiblnd_passive_connect+0x605/0xdd0
[54177.748451] [<ffffffff88366975>] :ko2iblnd:kiblnd_cm_callback+0x255/0xeb0
[54177.757088] [<ffffffff881620e7>] :rdma_cm:cma_req_handler+0x322/0x389
[54177.763985] [<ffffffff88155fa4>] :ib_cm:cm_process_work+0x17/0xad
[54177.770664] [<ffffffff88157025>] :ib_cm:cm_req_handler+0x7ae/0x81b
[54177.777248] [<ffffffff881570bf>] :ib_cm:cm_work_handler+0x2d/0xbaa
[54177.784045] [<ffffffff80236291>] run_workqueue+0x7f/0x10b
[54177.790439] [<ffffffff80236b1a>] worker_thread+0xda/0xe4
[54177.799862] [<ffffffff8023959a>] kthread+0x47/0x75
[54177.805672] [<ffffffff8020a2f8>] child_rip+0xa/0x12
[54177.811717]
[54177.813851] Mem-info:
[54177.816666] Node 0 DMA per-cpu:
[54177.820479] CPU 0: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.829621] CPU 1: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.839216] CPU 2: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.849488] CPU 3: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.859625] CPU 4: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.871977] CPU 5: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.881930] CPU 6: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.891980] CPU 7: Hot: hi: 0, btch: 1 usd: 0 Cold: hi: 0, btch: 1 usd: 0
[54177.902800] Node 0 DMA32 per-cpu:
[54177.906462] CPU 0: Hot: hi: 186, btch: 31 usd: 10 Cold: hi: 62, btch: 15 usd: 58
[54177.916162] CPU 1: Hot: hi: 186, btch: 31 usd: 26 Cold: hi: 62, btch: 15 usd: 3
[54177.926049] CPU 2: Hot: hi: 186, btch: 31 usd: 139 Cold: hi: 62, btch: 15 usd: 54
[54177.936948] CPU 3: Hot: hi: 186, btch: 31 usd: 1 Cold: hi: 62, btch: 15 usd: 13
[54177.946968] CPU 4: Hot: hi: 186, btch: 31 usd: 56 Cold: hi: 62, btch: 15 usd: 55
[54177.956868] CPU 5: Hot: hi: 186, btch: 31 usd: 30 Cold: hi: 62, btch: 15 usd: 57
[54177.965685] CPU 6: Hot: hi: 186, btch: 31 usd: 25 Cold: hi: 62, btch: 15 usd: 1
[54177.975412] CPU 7: Hot: hi: 186, btch: 31 usd: 13 Cold: hi: 62, btch: 15 usd: 56
[54177.986045] Node 0 Normal per-cpu:
[54177.990527] CPU 0: Hot: hi: 186, btch: 31 usd: 128 Cold: hi: 62, btch: 15 usd: 14
[54178.002993] CPU 1: Hot: hi: 186, btch: 31 usd: 81 Cold: hi: 62, btch: 15 usd: 1
[54178.012136] CPU 2: Hot: hi: 186, btch: 31 usd: 113 Cold: hi: 62, btch: 15 usd: 2
[54178.022533] CPU 3: Hot: hi: 186, btch: 31 usd: 124 Cold: hi: 62, btch: 15 usd: 8
[54178.032316] CPU 4: Hot: hi: 186, btch: 31 usd: 27 Cold: hi: 62, btch: 15 usd: 4
[54178.041380] CPU 5: Hot: hi: 186, btch: 31 usd: 24 Cold: hi: 62, btch: 15 usd: 9
[54178.050941] CPU 6: Hot: hi: 186, btch: 31 usd: 120 Cold: hi: 62, btch: 15 usd: 13
[54178.061180] CPU 7: Hot: hi: 186, btch: 31 usd: 166 Cold: hi: 62, btch: 15 usd: 12
[54178.072162] Active:28319 inactive:62389 dirty:8381 writeback:27 unstable:0
[54178.072163] free:5581 slab:273603 mapped:2117 pagetables:690 bounce:0
[54178.087805] Node 0 DMA free:11192kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:10660kB pages_scanned:0 all_unreclaimable? yes
[54178.103794] lowmem_reserve[]: 0 3255 4013
[54178.108294] Node 0 DMA32 free:9784kB min:6564kB low:8204kB high:9844kB active:51792kB inactive:133260kB present:3333728kB pages_scanned:0 all_unreclaimable? no
[54178.129648] lowmem_reserve[]: 0 0 757
[54178.133756] Node 0 Normal free:1348kB min:1524kB low:1904kB high:2284kB active:61484kB inactive:116296kB present:775680kB pages_scanned:728 all_unreclaimable? no
[54178.154399] lowmem_reserve[]: 0 0 0
[54178.158450] Node 0 DMA: 6*4kB 4*8kB 4*16kB 4*32kB 3*64kB 0*128kB 2*256kB 0*512kB 2*1024kB 0*2048kB 2*4096kB = 11192kB
[54178.172214] Node 0 DMA32: 65*4kB 17*8kB 37*16kB 6*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB = 9628kB
[54178.188210] Node 0 Normal: 0*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1464kB
[54178.202288] Swap cache: add 0, delete 0, find 0/0, race 0+0
[54178.208654] Free swap = 4200888kB
[54178.212390] Total swap = 4200888kB
[54178.218597] Free swap: 4200888kB
[54178.264623] 1245184 pages of RAM
[54178.268302] 231685 reserved pages
[54178.271793] 57602 pages shared
[54178.275306] 0 pages swap cached
[54178.278778] LustreError: 4106:0:(o2iblnd.c:732:kiblnd_create_conn()) Can't create CQ: -12
[54277.772930] ib_cm/2: page allocation failure. order:0, mode:0x10d0
[54277.781944]
[54277.781945] Call Trace:
[54277.788321] [<ffffffff8020ac61>] show_trace+0x34/0x47
[54277.793761] [<ffffffff8020ac86>] dump_stack+0x12/0x17
[54277.799744] [<ffffffff80251bc4>] __alloc_pages+0x2a3/0x2bc
[54277.806044] [<ffffffff8020f75c>] dma_alloc_pages+0x9b/0xbf
[54277.814225] [<ffffffff8020f7f6>] dma_alloc_coherent+0x76/0x1cc
[54277.821449] [<ffffffff8809af1e>] :ib_mthca:mthca_buf_alloc+0x1bd/0x2a3
[54277.831300] [<ffffffff8809f9a9>] :ib_mthca:mthca_alloc_qp_common+0x246/0x4e5
[54277.838934] [<ffffffff880a0c6d>] :ib_mthca:mthca_alloc_qp+0xab/0x102
[54277.846467] [<ffffffff880a5217>] :ib_mthca:mthca_create_qp+0x126/0x281
[54277.854289] [<ffffffff88054bc5>] :ib_core:ib_create_qp+0x17/0x91
[54277.862274] [<ffffffff88161c9f>] :rdma_cm:rdma_create_qp+0x2d/0x153
[54277.870048] [<ffffffff8835d0cc>] :ko2iblnd:kiblnd_create_conn+0x81c/0x1250
[54277.877973] [<ffffffff88365295>] :ko2iblnd:kiblnd_passive_connect+0x605/0xdd0
[54277.886679] [<ffffffff88366975>] :ko2iblnd:kiblnd_cm_callback+0x255/0xeb0
[54277.895646] [<ffffffff881620e7>] :rdma_cm:cma_req_handler+0x322/0x389
[54277.903470] [<ffffffff88155fa4>] :ib_cm:cm_process_work+0x17/0xad
[54277.910567] [<ffffffff88157025>] :ib_cm:cm_req_handler+0x7ae/0x81b
[54277.918121] [<ffffffff881570bf>] :ib_cm:cm_work_handler+0x2d/0xbaa
[54277.926378] [<ffffffff80236291>] run_workqueue+0x7f/0x10b
[54277.932202] [<ffffffff80236b1a>] worker_thread+0xda/0xe4
[54277.938003] [<ffffffff8023959a>] kthread+0x47/0x75
[54277.944032] [<ffffffff8020a2f8>] child_rip+0xa/0x12
[54277.950581]
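
One way to read the Mem-info dumps above is to compare each zone's free figure with its min watermark, and to re-add the per-order buddy lists as a sanity check. The following is only a minimal Python sketch under the assumption that the zone lines have the format shown in this log; the names zones_below_min and buddy_total_kb are hypothetical helpers, not part of any kernel or Lustre tooling.

```python
import re

# Zone summary lines copied (shortened) from the first Mem-info dump above.
ZONE_LINES = [
    "Node 0 DMA free:11192kB min:20kB low:24kB high:28kB",
    "Node 0 DMA32 free:9812kB min:6564kB low:8204kB high:9844kB",
    "Node 0 Normal free:1340kB min:1524kB low:1904kB high:2284kB",
]

# Assumed line format: "Node <n> <zone> free:<x>kB min:<y>kB ..."
ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB")


def zones_below_min(lines):
    """Return the names of zones whose free memory is under the min watermark."""
    below = []
    for line in lines:
        m = ZONE_RE.search(line)
        if m and int(m.group(3)) < int(m.group(4)):
            below.append(m.group(2))
    return below


def buddy_total_kb(line):
    """Sum a buddy-list line such as '48*4kB 33*8kB ... = 9608kB' in kB."""
    # Only 'count*size' pairs match; the trailing '= NNNNkB' total is ignored.
    return sum(int(n) * int(size) for n, size in re.findall(r"(\d+)\*(\d+)kB", line))


if __name__ == "__main__":
    print(zones_below_min(ZONE_LINES))
```

For the first dump this flags only the Normal zone (1340kB free against a min of 1524kB); DMA32 is still above its min watermark at that point.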
Any ideas?
Thanks,
Bernd
--
Bernd Schubert
Q-Leap Networks GmbH