[openib-general] [PATCH] IB/core - ib_umad can cause address alignment fault on ia64

Michael S. Tsirkin mst at mellanox.co.il
Thu Jan 18 08:14:31 PST 2007


> Quoting John W. Marland <jwm at prairieinet.net>:
> Subject: Re: [PATCH] IB/core - ib_umad can cause address alignment fault on ia64
> 
> Michael S. Tsirkin wrote:
> 
> >>Quoting Ralph Campbell <ralph.campbell at qlogic.com>:
> >>Subject: [PATCH] IB/core - ib_umad can cause address alignment fault on ia64
> >>
> >>IB/core - ib_umad can cause address alignment fault
> >>
> >>In user_mad.c, the definition for struct ib_umad_packet includes
> >>struct ib_user_mad at an odd 32-bit offset.  When ib_umad_write()
> >>tries to assign rmpp_mad->mad_hdr.tid, there is an alignment fault on
> >>architectures which have strict alignment for load/stores.
> >>This patch fixes the problem by changing the offset on which
> >>struct ib_user_mad is defined within struct ib_umad_packet.
> >>
> >>Thanks go to John W. Marland <jwm at prairieinet.net> for finding this.
> >>
> >>Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
> >>
> >>diff -r b1128b48dc99 drivers/infiniband/core/user_mad.c
> >>--- a/drivers/infiniband/core/user_mad.c	Fri Jan 12 20:00:03 2007 +0000
> >>+++ b/drivers/infiniband/core/user_mad.c	Wed Jan 17 14:09:37 2007 -0800
> >>@@ -125,7 +125,7 @@ struct ib_umad_packet {
> >> 	struct ib_mad_send_buf *msg;
> >> 	struct ib_mad_recv_wc  *recv_wc;
> >> 	struct list_head   list;
> >>-	int		   length;
> >>+	long		   length;
> >> 	struct ib_user_mad mad;
> >> };
> >>    
> >>
> >
> >This does not make sense to me - do we have to replace all int fields with long
> >now? Looks like a compiler or makefile bug in your setup - struct fields should
> >be naturally aligned.
> >
> >  
> >
>     We should probably have given a more complete explanation. The 
> unaligned access hits in two places, that I've tracked down so far.
>     The one where it's easiest to see what's happening is in ib_umad_write.
> ______________________________________________________________________________________
>        if (!ib_response_mad(packet->msg->mad)) {
>                 tid = &((struct ib_mad_hdr *) packet->msg->mad)->tid;
>                 *tid = cpu_to_be64(((u64) agent->hi_tid) << 32 |
>                                    (be64_to_cpup(tid) & 0xffffffff));
> 
> ---> this line causes the access problem               
> rmpp_mad->mad_hdr.tid = *tid;
>         }
> ________________________________________________________________________________________
>     The rmpp_mad variable is an ib_rmpp_mad pointer that is initialized 
>     from the packet->mad.data early in the function.
>     Because the ib_umad_packet structure has a as it's last element an 
>     ib_user_mad structure, not a pointer to one, but the structure.
>     This means that the Data[0] declaration at the end of the ib_umad 
>     structure is forced onto a 4 byte boundary.

So the issue is that we are casting char *data which has no alignment guarantees
to 64 bit number. We really must find a way to force 64 bit alignment for
struct ib_user_mad all over. Would not something like the following simple trick work?

struct ib_user_mad_hdr {
	.............
} __attribute__((aligned (8)));



-- 
MST




More information about the general mailing list