[openib-general] Re: FW: [PATCH 1 of 3] mad: large RMPP support
Michael S. Tsirkin
mst at mellanox.co.il
Thu Feb 9 16:54:23 PST 2006
Sean, at least I am a bit confused at this point. Let's go back and
summarize the list of issues you see with the last patchset, OK?
As far as I can see, we decided that the list of segments is the right approach.
It also seems you are leaning towards uniform handling of the first
segment and the rest of them, so I hope this means the simplification
achieved by always using an s/g list of size 2 is accepted as well.
It seems to me the only issue left is the extra list walks needed when we
look up the segment by number. The simplest solution to that would
probably be tracking the chunk addressed by seg_num, or something
along these lines.
Right?
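To make the seg_num tracking idea concrete, here is a rough user-space sketch. It is not code from the patch: struct seg, struct seg_cursor and seg_lookup() are names made up for illustration. The cursor remembers the segment returned by the previous lookup, so the usual forward-moving send/ack pattern does not rewalk the list from the head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the names used in the patch. */
struct seg {
	struct seg *next;
	int num;		/* 1-based segment number */
};

struct seg_cursor {
	struct seg *head;	/* first segment in the list */
	struct seg *last;	/* segment returned by the previous lookup */
};

/* Find segment seg_num, resuming from the cached position when possible. */
static struct seg *seg_lookup(struct seg_cursor *c, int seg_num)
{
	struct seg *s = c->head;

	assert(seg_num >= 1);	/* segment numbers start at 1 */

	/* Lookups mostly move forward, so the cached entry is usually usable. */
	if (c->last && c->last->num <= seg_num)
		s = c->last;

	while (s && s->num < seg_num)
		s = s->next;

	if (s && s->num == seg_num) {
		c->last = s;	/* remember where we stopped */
		return s;
	}
	return NULL;		/* no such segment; cache left unchanged */
}
```

A lookup that goes backwards (a retransmit, say) falls back to a walk from the head, which should be the rare case.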
Some more comments below.
Quoting Sean Hefty <mshefty at ichips.intel.com>:
>
> Michael S. Tsirkin wrote:
> >Given that the last segment has a different size, it seems cleaner
> >to just keep the segment size part of ib_mad_segment structure.
>
> The last segment should provide any necessary padding, so that the
> resulting MAD is 256-bytes. Segments 2 through n should be the same size.
Uh, right. So segment size could just be a define.
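Roughly what I have in mind, as a sketch only: the macro and function names are invented here, and the 56-byte header length is an assumed value for illustration (the real header length depends on the management class).

```c
#include <assert.h>
#include <stddef.h>

#define MAD_SIZE 256			/* every segment goes out as a 256-byte MAD */
#define HDR_SIZE 56			/* assumed header length, illustration only */
#define SEG_DATA (MAD_SIZE - HDR_SIZE)	/* payload bytes carried per segment */

/* Number of segments needed to carry data_len payload bytes (at least one). */
static size_t seg_count(size_t data_len)
{
	return data_len ? (data_len + SEG_DATA - 1) / SEG_DATA : 1;
}

/* Padding bytes the last segment needs to fill out a whole MAD. */
static size_t last_seg_pad(size_t data_len)
{
	size_t total = seg_count(data_len) * SEG_DATA;

	assert(total >= data_len);
	return total - data_len;
}
```

With a fixed per-segment size there is nothing to store in each list entry; only the payload length of the whole message has to be kept.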
> >I don't think it's a good idea.
> >Recall that when you send segments starting from the second one,
> >you need the header from *mad. So this gets ugly very quickly.
>
> This is true whether the first segment is in the rmpp_list or not. From a
> user's viewpoint, they can walk all segments using the list operations.
> Otherwise, they need to reference the first segment using *mad, then all
> other segments using a list. I do see the issue that the first segment
> requires an offset, whereas, others do not.
>
> >On the other hand, it makes sense to keep the single mad case
> >as simple as possible. So that's a good reason to have the rmpp list
> >include segments starting from the second one.
>
> For single segment MADs, the rmpp_list can be ignored by the user. It's
> just that the internal code can be easier. We won't have to special case
> tracking the last segment acked as being either referenced by *mad or a
> pointer to a segment.
>
> Hmm... okay, how about this idea? For single segment MADs, only *mad is
> used. For multiple segment MADs, *mad references the repeated header only.
> All data segments are in the rmpp_list.
I don't really think it matters that much how we shuffle the buffers around.
What matters to me is making the rmpp and mad code as simple as possible,
and hiding all these details away from the user.
> Does anything outside of userspace send a multi-segment RMPP MAD? Is it
> likely that a kernel component would need to?
>
> - Sean
>
Sean, I think Jack's patch solves all these issues quite nicely:
- There's an API function to get a MAD segment by index,
so that users never have to know about how RMPP works -
they get a pointer and size and can fill it in.
- Anyone can fill these buffers: user_mad or another kernel component.
- For sending data, there's a unified approach: always use an s/g list
of size 2.
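
The points above could look something like the following user-space sketch. This is not the API from Jack's patch: get_segment(), build_sge(), struct send_buf and struct sge are hypothetical names, the 56-byte header length is an assumed value, and an array of per-segment buffers stands in for the patch's segment list.

```c
#include <assert.h>
#include <stddef.h>

#define MAD_SIZE 256			/* every segment goes out as a 256-byte MAD */
#define HDR_SIZE 56			/* assumed header length, illustration only */
#define SEG_DATA (MAD_SIZE - HDR_SIZE)

/* Hypothetical scatter/gather entry. */
struct sge {
	void *addr;
	size_t len;
};

/* Hypothetical send buffer: one repeated header plus per-segment payloads. */
struct send_buf {
	unsigned char hdr[HDR_SIZE];
	unsigned char **seg;	/* nsegs buffers of SEG_DATA bytes each */
	int nsegs;
};

/* Return the payload of segment seg_num (1-based), or NULL if out of range. */
static void *get_segment(struct send_buf *b, int seg_num, size_t *len)
{
	assert(len != NULL);
	if (seg_num < 1 || seg_num > b->nsegs)
		return NULL;
	*len = SEG_DATA;
	return b->seg[seg_num - 1];
}

/* Fill the 2-entry s/g list: [0] is the header, [1] is the payload. */
static int build_sge(struct send_buf *b, int seg_num, struct sge sg[2])
{
	size_t len;
	void *payload = get_segment(b, seg_num, &len);

	if (!payload)
		return -1;
	sg[0].addr = b->hdr;
	sg[0].len = HDR_SIZE;
	sg[1].addr = payload;
	sg[1].len = len;
	return 0;
}
```

The caller only sees a pointer and a size, and the send path is the same for the first segment and for every later one.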
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies