[ofa-general] [PATCH RFC] RDMA: New Memory Extensions.

Ralph Campbell ralph.campbell at qlogic.com
Thu May 15 11:53:17 PDT 2008


On Thu, 2008-05-15 at 13:39 -0500, Steve Wise wrote:
> Ralph Campbell wrote:
> > On Wed, 2008-05-14 at 19:49 -0700, Roland Dreier wrote:
> >   
> >>  > So you want the page size specified in the fast_reg_page_list as
> >>  > opposed to when the page list is bound to the fast_reg mr (via
> >>  > post_send)?
> >>
> >> It's kind of the same thing, since the fast_reg_page_list is part of the
> >> send work request... the structures you have at the moment are:
> >>
> >>  > +		struct {
> >>  > +			u64				iova_start;
> >>  > +			struct ib_fast_reg_page_list	*page_list;
> >>  > +			int				fbo;
> >>  > +			u32				length;
> >>  > +			int				access_flags;
> >>  > +			struct ib_mr 			*mr;
> >>
> >> (side note... move this pointer up with the other pointers, so you don't
> >> end up with a hole in the structure due to alignment... or stick an int
> >> page_size in to fill the hole)
> >>
> >>  > +		} fast_reg;
> >>
> >>  > +struct ib_fast_reg_page_list {
> >>  > +	struct ib_device 	*device;
> >>  > +	u64			*page_list;
> >>  > +	int			page_list_len;
> >>  > +};
> >>
> >> is page_list_len the maximum length of the page_list, or is it filled in
> >> by the consumer?  The driver could figure out the length of the
> >> page_list for any given work request by looking at the MR length and the
> >> page_size I suppose.
> >>
> >>  - R.
> >>     
> >
> > I think Roland and Steve misunderstood what I was asking about
> > the struct ib_fast_reg_page_list * returned from
> > ib_alloc_fast_reg_page_list().
> >
> > The question is "what can the caller do with the pointer?"
> > Clearly, the caller can pass the pointer to
> > ib_post_send(IB_WR_FAST_REG_MR) and use the [LR]_Key in the
> > normal ways.
> >
> > Can the caller dereference the pointer and look at the
> > values in page_list[]? Are these values understood to be
> > physical addresses that can be passed to phys_to_virt(), for example?
> > Are they byte addresses always aligned to a page boundary?
> >
> >   
> 
> The caller must _fill in_ the values in the page list.  That's the whole
> point.  I.e., all this function does is allocate the _memory_ to store
> the page list that the caller is building.  The special function is
> needed because some devices might need to DMA the page list array from
> this memory as part of processing the FAST_REG_MR work request, and thus
> need it to be allocated DMA-coherently.  The pointer returned is a kernel
> virtual address and can be read from/written to by the caller.
> 
> > The reason I ask is that the address used with the [LR]_Key from
> > ib_get_dma_mr() has to be translated with ib_dma_map_single(), etc.
> > because the ipath driver doesn't necessarily use physical addresses
> > for the address in the send WQEs. Normally, the address in the
> > send WQE is a kernel virtual address so the ib_ipath driver can
> > memcpy() the data to the chip.
> >   
> 
> > Let's say that ib_ipath uses vmalloc() to allocate the pages
> > instead of dma_alloc_coherent(). As long as the ULP only uses
> > the page_list values as an uninterpreted number that is passed
> > back to the driver via subsequent verbs calls, it wouldn't
> > matter to the ULP what the number represents. But if the ULP
> > expects to be able to call some other kernel function to
> > map or translate that value, then the ULP has to know what
> > kind of number it represents, its size and alignment, etc.
> >   
> 
> 
> We're not talking about allocating the pages themselves. 
> 
> Here's an example (ignoring errors):
> 
> page_list = ib_alloc_fast_reg_page_list(device, 1);
> 
> v = get_free_page(GFP_KERNEL);
> 
> page_list->page_list[0] = ib_dma_map_single(device, v, PAGE_SIZE,
>                                             DMA_BIDIRECTIONAL);
> 
> wr.opcode = IB_WR_FAST_REG_MR;
> wr.next = NULL;
> wr.send_flags = 0;
> wr.wr_id = 0xdeadbeef;
> wr.wr.fast_reg.mr = mr;
> wr.wr.fast_reg.page_list = page_list;
> wr.wr.fast_reg.page_size = PAGE_SIZE;
> wr.wr.fast_reg.page_list_len = 1;
> wr.wr.fast_reg.first_byte_offset = 0;
> wr.wr.fast_reg.iova_start = (u64)v;
> wr.wr.fast_reg.length = PAGE_SIZE;
> wr.wr.fast_reg.access_flags = IB_ACCESS_LOCAL_WRITE |
>                               IB_ACCESS_REMOTE_READ |
>                               IB_ACCESS_REMOTE_WRITE;
> 
> ib_post_send(qp, &wr, &bad_wr);

OK. Thanks for clarifying. This wasn't clear to me from the
original description but I understand now.
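
For reference, Roland's remark above that the driver could work out the
page_list length from the MR length and the page_size can be sketched as
follows. This is a user-space illustration only, not code from the patch:
fast_reg_pages_needed() and page_list_entry_ok() are hypothetical names,
and the sketch assumes that page_size is a power of two and that fbo is
the byte offset of the region within its first page.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the proposed verbs API): number of
 * page_list entries needed to cover `length` bytes that start `fbo`
 * bytes into the first page.
 */
static uint32_t fast_reg_pages_needed(uint32_t fbo, uint32_t length,
                                      uint32_t page_size)
{
    /* Assumption: page_size is a non-zero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    if (length == 0)
        return 0;

    /* Use 64-bit arithmetic so fbo + length cannot wrap around. */
    uint64_t span = (uint64_t)fbo + length;

    /* Round up to a whole number of pages. */
    return (uint32_t)((span + page_size - 1) / page_size);
}

/*
 * Hypothetical check: returns non-zero if a page_list entry is aligned
 * to page_size, i.e. it is the address of the start of a page.
 */
static int page_list_entry_ok(uint64_t addr, uint32_t page_size)
{
    return (addr & ((uint64_t)page_size - 1)) == 0;
}
```

With the values from Steve's example (offset 0, length PAGE_SIZE), this
gives a single entry, matching page_list_len = 1. A non-zero offset with
the same length would spill into a second page.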



