[ofa-general] mlx4: device driver tries to sync DMA memory it has not allocated

Eli Cohen eli at dev.mellanox.co.il
Mon Aug 10 01:45:27 PDT 2009


Looking at mlx4_write_mtt_chunk(), I see that it calls
mlx4_table_find() with a pointer to a single dma_addr_t, dma_handle,
while the DMA addresses for the ICM memory are actually a list of
different addresses, possibly covering different sizes. I think
mlx4_table_find() should be changed to support that, and then we can
call dma_sync_single_for_cpu()/dma_sync_single_for_device() with the
correct DMA addresses.
Roland, what do you think?
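
To make the idea concrete, here is a rough user-space sketch of the kind
of lookup I have in mind. It is not the mlx4 code: the types and names
(model_chunk, model_sync_range, model_find_sync_range) are made up for
illustration, and it assumes the ICM memory can be described as a flat
list of (DMA address, length) pieces. Given a byte offset, it returns the
DMA address of that byte and how many of the requested bytes live in the
same piece, which is roughly what a sync call would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's dma_addr_t (assumption). */
typedef uint64_t model_dma_addr_t;

/* One separately mapped piece of memory (hypothetical layout). */
struct model_chunk {
	model_dma_addr_t dma_addr;	/* bus address of this piece */
	size_t len;			/* bytes covered by this piece */
};

/* What a caller would hand to a dma_sync_single_*() style call. */
struct model_sync_range {
	model_dma_addr_t dma_addr;	/* address of the requested byte */
	size_t len;			/* bytes valid from dma_addr on */
};

/*
 * Walk the chunk list to find the piece containing byte 'offset'.
 * On success, store the DMA address of that byte and the number of
 * the 'want' bytes that fall inside the same piece, and return 0.
 * Return -1 if 'offset' lies beyond all pieces.
 */
int model_find_sync_range(const struct model_chunk *chunks, size_t nchunks,
			  size_t offset, size_t want,
			  struct model_sync_range *out)
{
	size_t i;

	assert(chunks != NULL && out != NULL);

	for (i = 0; i < nchunks; i++) {
		if (offset < chunks[i].len) {
			size_t avail = chunks[i].len - offset;

			out->dma_addr = chunks[i].dma_addr + offset;
			/* Never report a range that crosses into the next piece. */
			out->len = want < avail ? want : avail;
			return 0;
		}
		/* Not in this piece: make the offset relative to the next one. */
		offset -= chunks[i].len;
	}
	return -1;
}
```

A caller that needs more bytes than out->len would repeat the lookup at
offset + out->len, so that each sync stays within one mapped piece.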

On Sat, Aug 08, 2009 at 07:49:22PM +0200, Bart Van Assche wrote:
> Hello,
> 
> Has anyone ever encountered a message like the one below? This message was
> generated while booting a 2.6.30.4 kernel with CONFIG_DMA_API_DEBUG=y and
> before any out-of-tree kernel modules were loaded.
> 
> ------------[ cut here ]------------
> WARNING: at lib/dma-debug.c:635 check_sync+0x47c/0x4b0()
> Hardware name: P5Q DELUXE
> mlx4_core 0000:01:00.0: DMA-API: device driver tries to sync DMA memory it
> has not allocated [device address=0x0000000139482000] [size=4096 bytes]
> Modules linked in: snd_hda_codec_atihdmi snd_hda_codec_analog snd_hda_intel
> snd_hda_codec snd_hwdep snd_pcm snd_timer snd rtc_cmos soundcore i2c_i801
> rtc_core hid_belkin mlx4_core(+)
> rtc_lib sr_mod sg snd_page_alloc pcspkr button intel_agp i2c_core joydev
> serio_raw cdrom usbhid hid raid456 raid6_pq async_xor async_memcpy async_tx
> xor raid0 sd_mod crc_t10dif
> ehci_hcd uhci_hcd usbcore edd raid1 ext3 mbcache jbd fan ide_pci_generic
> ide_core ata_generic ata_piix pata_marvell ahci libata scsi_mod thermal
> processor thermal_sys hwmon
> Pid: 1325, comm: work_for_cpu Not tainted 2.6.30.4-scst-debug #6
> Call Trace:
>  [<ffffffff8039bc7c>] ? check_sync+0x47c/0x4b0
>  [<ffffffff80248b48>] warn_slowpath_common+0x78/0xd0
>  [<ffffffff80248bfc>] warn_slowpath_fmt+0x3c/0x40
>  [<ffffffff80517769>] ? _spin_lock_irqsave+0x49/0x60
>  [<ffffffff8039b8ab>] ? check_sync+0xab/0x4b0
>  [<ffffffff8039bc7c>] check_sync+0x47c/0x4b0
>  [<ffffffff802724ac>] ? mark_held_locks+0x6c/0x90
>  [<ffffffff8039be1d>] debug_dma_sync_single_for_cpu+0x1d/0x20
>  [<ffffffffa024a969>] mlx4_write_mtt+0x159/0x1e0 [mlx4_core]
>  [<ffffffffa0243c02>] mlx4_create_eq+0x222/0x650 [mlx4_core]
>  [<ffffffff8027281d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffffa02441f5>] mlx4_init_eq_table+0x1c5/0x4a0 [mlx4_core]
>  [<ffffffffa0248b08>] mlx4_setup_hca+0x98/0x550 [mlx4_core]
>  [<ffffffffa0249891>] ? __mlx4_init_one+0x8d1/0x920 [mlx4_core]
>  [<ffffffffa0249331>] __mlx4_init_one+0x371/0x920 [mlx4_core]
>  [<ffffffffa024df18>] mlx4_init_one+0x22/0x44 [mlx4_core]
>  [<ffffffff8025cd90>] ? do_work_for_cpu+0x0/0x30
>  [<ffffffff803a43e2>] local_pci_probe+0x12/0x20
>  [<ffffffff8025cda3>] do_work_for_cpu+0x13/0x30
>  [<ffffffff802613e6>] kthread+0x56/0x90
>  [<ffffffff8020cffa>] child_rip+0xa/0x20
>  [<ffffffff8020c9c0>] ? restore_args+0x0/0x30
>  [<ffffffff80261390>] ? kthread+0x0/0x90
>  [<ffffffff8020cff0>] ? child_rip+0x0/0x20
> ---[ end trace 4480af29bc755c6a ]---
> 
> Bart.
