[openib-general] Re: FW: [PATCH 1 of 3] mad: large RMPP support
Michael S. Tsirkin
mst at mellanox.co.il
Thu Feb 9 15:46:54 PST 2006
Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: FW: [PATCH 1 of 3] mad: large RMPP support
>
> Roland Dreier wrote:
> >My rule of thumb is that we shouldn't rely on being able to allocate a
> >contiguous buffer bigger than 4 KB, but assuming we can allocate 4 KB
> >is fine. 4 KB is the lowest page size of any real architecture, and
> >if the kernel is out of free pages then any allocation is likely to
> >fail. Allocations of larger buffers may fail because of memory
> >fragmentation, even with plenty of free memory.
> >
> >That is: a 4 KB buffer is fine.
>
> Given this, I think that we'll need to go with the linked list then. Maybe
> something like:
>
> struct ib_mad_segment {
> 	struct list_head list;
> 	u8 data[0];
> };
>
> struct ib_mad_send_buf {
> 	...
> 	void *mad; /* first segment */
> 	struct list_head rmpp_list;
> 	u32 segment_size;
> 	...
> };
Given that the last segment has a different size, it seems cleaner
to just keep the segment size as part of the ib_mad_segment structure.
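A rough sketch of what I mean, written as user-space C so that it compiles
standalone. The list_head here is only a stand-in for the kernel type, and the
"size" field name and the seg_alloc() helper are made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* The quoted ib_mad_segment plus a per-segment size (hypothetical field name). */
struct ib_mad_segment {
	struct list_head list;
	unsigned int size;	/* number of valid bytes in data[] */
	unsigned char data[];	/* C99 flexible array, the data[0] of the quoted struct */
};

/* Hypothetical helper: allocate one segment holding a copy of len bytes. */
static struct ib_mad_segment *seg_alloc(const void *src, unsigned int len)
{
	struct ib_mad_segment *seg;

	assert(src != NULL);
	seg = malloc(sizeof(*seg) + len);
	if (!seg)
		return NULL;
	seg->list.next = seg->list.prev = &seg->list;	/* not on any list yet */
	seg->size = len;
	memcpy(seg->data, src, len);
	return seg;
}
```

With this, a short last segment simply records its own length, and
nothing else has to special-case it.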
> I'm undecided about whether all MADs should use the rmpp_list, with *mad
> referencing the data of the first segment. This keeps the code consistent,
> but would result in the first segment being larger (256-bytes) than
> additional segments (say 220-bytes).
I don't think it's a good idea.
Recall that when you send segments starting from the second one,
you need the header from *mad. So this gets ugly very quickly.
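To illustrate, here is a minimal sketch of building the wire image for a later
segment. The sizes are just the example figures from above (256-byte MAD,
220-byte payload), and build_wire_mad() is a hypothetical helper, not an
existing routine:

```c
#include <assert.h>
#include <string.h>

enum {
	MAD_SIZE = 256,			/* full MAD, as in the 256-byte first segment */
	SEG_DATA = 220,			/* example payload size from the thread */
	HDR_SIZE = MAD_SIZE - SEG_DATA	/* header bytes that only *mad carries */
};

/*
 * Hypothetical helper: build the wire image of the second or a later segment.
 * The header is copied from the first MAD, the payload from the segment.
 * A real sender would also have to update the RMPP fields (such as the
 * segment number) in the copied header.
 */
static void build_wire_mad(unsigned char out[MAD_SIZE],
			   const unsigned char *first_mad,
			   const unsigned char *seg_data, size_t seg_len)
{
	assert(seg_len <= SEG_DATA);
	memcpy(out, first_mad, HDR_SIZE);
	memcpy(out + HDR_SIZE, seg_data, seg_len);
	/* Zero-pad a short (last) segment up to the full MAD size. */
	memset(out + HDR_SIZE + seg_len, 0, SEG_DATA - seg_len);
}
```

Every segment after the first needs this extra copy from *mad, which is
where mixing a 256-byte first entry into the same list gets awkward.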
> Users could then walk the list of buffers without calling a routine that
> needs to start at the beginning of the list every time.
>
> - Sean
On the other hand, it makes sense to keep the single-MAD case
as simple as possible. So that's a good reason to have the rmpp_list
include only the segments starting from the second one.
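Something along these lines, again as a standalone user-space sketch. The
helper names (send_buf_init, add_segment, get_seg_data) are hypothetical, and
list_head is a stand-in for the kernel type:

```c
#include <assert.h>
#include <stdlib.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

struct ib_mad_segment {
	struct list_head list;	/* kept as first member, see the cast below */
	unsigned char data[];
};

struct ib_mad_send_buf {
	void *mad;			/* first segment, header included */
	struct list_head rmpp_list;	/* segments 2..N only */
};

static void send_buf_init(struct ib_mad_send_buf *buf, void *first_mad)
{
	buf->mad = first_mad;
	buf->rmpp_list.next = buf->rmpp_list.prev = &buf->rmpp_list;
}

/* Allocate a zeroed segment with len payload bytes and append it at the tail. */
static struct ib_mad_segment *add_segment(struct ib_mad_send_buf *buf, size_t len)
{
	struct ib_mad_segment *seg = calloc(1, sizeof(*seg) + len);

	if (!seg)
		return NULL;
	seg->list.next = &buf->rmpp_list;
	seg->list.prev = buf->rmpp_list.prev;
	buf->rmpp_list.prev->next = &seg->list;
	buf->rmpp_list.prev = &seg->list;
	return seg;
}

/* Return the data of segment seg_num (1-based), or NULL if there is none. */
static void *get_seg_data(struct ib_mad_send_buf *buf, unsigned int seg_num)
{
	struct list_head *pos;
	unsigned int n = 2;

	assert(buf != NULL);
	if (seg_num == 0)
		return NULL;
	if (seg_num == 1)
		return buf->mad;	/* single-MAD case never touches the list */

	for (pos = buf->rmpp_list.next; pos != &buf->rmpp_list; pos = pos->next, n++)
		if (n == seg_num)
			return ((struct ib_mad_segment *)pos)->data;
	return NULL;
}
```

Note that get_seg_data() restarts from the head of the list on each call,
which is the cost Sean mentions. A caller that walks all segments in order
could keep the list position itself instead.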
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies