[openib-general] Re: [PATCH] fmr support in mthca

Michael S. Tsirkin mst at mellanox.co.il
Tue Mar 22 08:33:44 PST 2005


Quoting r. Roland Dreier <roland at topspin.com>:
> Subject: Re: [PATCH] fmr support in mthca
> 
>     Michael> Good, glad to help. I will try to address your comments
>     Michael> next week (it's already the weekend here).
> 
> No problem.  Libor won't be back until Monday so I won't even try SDP
> until then.
> 
>     Roland> What if we just reserve something like 64K MPTs and MTTs
>     Roland> for FMRs and ioremap everything at driver startup?  That
>     Roland> would only use a few MB of vmalloc space and probably
>     Roland> simplify the code too.
> 
>     Michael> I don't like these pre-allocations - if someone is only
>     Michael> using SDP and IP over IB, it seems he will need almost
>     Michael> no regular regions.  64K MTTs with 4K page size cover up
>     Michael> to 256MByte of memory.
> 
> We can bump up the numbers if you want.  Right now the default
> allocation is 1 << 20 MTT segments (8 << 20 MTT entries).  I see no
> problem with having 64K MPTs and 256K MTT segments reserved for FMRs by
> default.  That should be more than enough for a single HCA -- 256K MTT
> segments means that 2 million pages or 8 GB of IO could be in flight
> at a time, which doesn't seem like a harsh limit to me.
> 
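
The arithmetic above can be checked with a small sketch.  This is not
driver code: mtt_coverage_bytes() is a hypothetical helper, and the
factor of 8 entries per segment is simply taken from the figures quoted
above (1 << 20 segments == 8 << 20 entries).

```c
#include <assert.h>
#include <stdint.h>

/* Entries per MTT segment, taken from the figures quoted above
 * (1 << 20 segments == 8 << 20 entries). */
#define MTT_ENTRIES_PER_SEG 8u

/* Bytes of memory that can be mapped at once if every MTT entry
 * points at one page of page_size bytes.  Hypothetical helper. */
uint64_t mtt_coverage_bytes(uint64_t nsegs, uint64_t page_size)
{
	assert(page_size != 0);
	return nsegs * MTT_ENTRIES_PER_SEG * page_size;
}
```

With 256K segments and 4K pages this comes to 2M entries, i.e. 8 GB,
which matches the estimate in the quoted paragraph.
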
> Ultimately we can make the allocations tunable at device init time,
> along with the rest of the parameters (number of QPs, number of CQs,
> etc).  I haven't seen much pressure to do that so far but it is
> definitely in my plans.
> 
>     Michael> My other problem with this approach was in the
>     Michael> implementation: the existing allocator and table code can
>     Michael> be passed a reserved parameter, but don't have the ability
>     Michael> to allocate out of that pool. So we'd have to allocate out
>     Michael> of a separate allocator, and take care so that keys do not
>     Michael> conflict. This gets a bit complicated.
> 
> I think this is the way to go.  Keys are easy to deal with -- in
> mthca_init_mr_table, we could just pass dev->limits.num_fmrs instead
> of dev->limits.reserved_mrws when initializing dev->mr_table.mpt_alloc,
> and then create a new table of size dev->limits.num_fmrs and reserve
> dev->limits.reserved_mrws out of that table.
> 
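
A minimal sketch of the key split described above, assuming a toy
index allocator rather than the real mthca one.  struct idx_alloc and
its functions are hypothetical stand-ins; the only point is that the
main table reserves the low num_fmrs indices, while a second table of
size num_fmrs reserves the low reserved_mrws indices, so the two can
never hand out the same index.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's index allocator: hands out
 * indices in [0, num), never returning the first `reserved` ones. */
struct idx_alloc {
	uint32_t num;
	uint32_t reserved;
	unsigned char *used;	/* one byte per index, for clarity */
};

int idx_alloc_init(struct idx_alloc *a, uint32_t num, uint32_t reserved)
{
	assert(reserved <= num);
	a->used = calloc(num ? num : 1, 1);
	if (!a->used)
		return -1;
	a->num = num;
	a->reserved = reserved;
	for (uint32_t i = 0; i < reserved; i++)
		a->used[i] = 1;		/* permanently taken */
	return 0;
}

/* Returns a free index, or UINT32_MAX when the table is exhausted. */
uint32_t idx_alloc_get(struct idx_alloc *a)
{
	for (uint32_t i = a->reserved; i < a->num; i++) {
		if (!a->used[i]) {
			a->used[i] = 1;
			return i;
		}
	}
	return UINT32_MAX;
}

void idx_alloc_cleanup(struct idx_alloc *a)
{
	free(a->used);
	a->used = NULL;
}
```

For example, initializing the main table with (num_mpts, num_fmrs) and
the FMR table with (num_fmrs, reserved_mrws) makes regular MRs start at
index num_fmrs and FMRs fall in [reserved_mrws, num_fmrs).
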
> The buddy allocator is a little more work but it needs to be cleaned
> up and encapsulated better anyway.  Once that's done we'd just have
> two buddy allocators.  The first one would cover all the MTT segments,
> and we'd first take out a chunk of that one to cover the reserved MTTs
> and then allocate another chunk that can hold whatever number of MTT
> segments we decide to use for FMRs.
> 
>     Michael> Maybe do something separate for 32 bit kernels (like -
>     Michael> disable FMR support)?
> 
> No FMRs on 32-bit kernels isn't going to fly.  It doesn't seem that
> hard to make things work on i386 so why not do it?
> 
>     Michael> Yes, but for MTTs the addresses may not be physically
>     Michael> contiguous, unless we want to limit FMRs to PAGE_SIZE/8
>     Michael> MTTs, which means 512 MTTs, that is 2MByte with 4K FMR
>     Michael> page size.  And it seems possible that even with this
>     Michael> limitation MTTs for a specific FMR start at a non-page-
>     Michael> aligned boundary.
> 
> I think it's fine to limit an FMR to 512 MTT entries.  I'd have to
> look at the source to be sure of the exact numbers, but I know that
> for the Topspin stack, neither SDP nor SRP is using more than 32
> entries per FMR.  A limit of mapping 512 pages/2 MB per FMR seems
> fine.  I don't know of anyone using FMRs even close to that big.
> 
> Even if it turns out to be too small, I see no problem with adding a
> small array of something on the order of 2 or 4 MTT pages.
> 
> If we use the buddy allocator for MTT entries for FMRs, then alignment
> is OK.  The buddy allocator guarantees that objects will be aligned to
> their size, which means that the MTT segments will never cross a page
> boundary.
> 
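
The alignment argument can be illustrated with a short check.  This is
only a sketch with hypothetical helper names; it assumes 8-byte MTT
entries (implied by PAGE_SIZE/8 == 512 above) and a power-of-two page
size, and it models a buddy allocation as a power-of-two run of entries
whose start index is a multiple of its length.

```c
#include <assert.h>
#include <stdint.h>

#define MTT_ENTRY_SIZE 8u	/* bytes; implied by PAGE_SIZE/8 == 512 above */

/* Nonzero if `count` MTT entries starting at entry index `first`
 * span more than one page of page_size bytes. */
int mtt_range_crosses_page(uint64_t first, uint64_t count, uint64_t page_size)
{
	assert(count > 0 && page_size > 0);

	uint64_t start = first * MTT_ENTRY_SIZE;
	uint64_t end = start + count * MTT_ENTRY_SIZE - 1;	/* last byte */

	return start / page_size != end / page_size;
}

/* Checks every size-aligned power-of-two block that fits in one page,
 * across the first `npages` pages.  Returns 1 if none crosses a page. */
int buddy_blocks_stay_in_page(uint64_t page_size, uint64_t npages)
{
	uint64_t per_page = page_size / MTT_ENTRY_SIZE;
	uint64_t total = per_page * npages;

	for (uint64_t size = 1; size <= per_page; size <<= 1)
		for (uint64_t first = 0; first + size <= total; first += size)
			if (mtt_range_crosses_page(first, size, page_size))
				return 0;
	return 1;
}
```

With 4K pages, any size-aligned block of up to 512 entries stays inside
one page, whereas an unaligned run such as 4 entries starting at index
510 does not.
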
>  - R.
> 

OK. I thought about it and I buy this design.
I'll prepare a patch along these lines.

MST

-- 
MST - Michael S. Tsirkin


