[openib-general] RE: [RFC] DAT 2.0 immediate data proposal

Tue Jan 24 14:42:13 PST 2006

OK, maybe we should back up and start over...

This is exactly why immediate data was initially proposed as an 
extension instead of a general API. We start to penalize native IB 
features based on the requirements of other RDMA interfaces that have to 
emulate the feature anyway. What prevents the next RDMA interface that 
comes along from requiring other variations of the interface due to 
implementation implications? This is an IB-specific feature that does 
not map well onto iWARP, so let's just call it what it is and let IB 
providers supply immediate data capabilities via the extension interface.

-arlin

Caitlin Bestler wrote:

>>
>>Maybe we need to just go back to one model and always deliver
>>via the event? With the post_recv_immed requirements, other
>>transports have a mechanism to emulate and create the
>>necessary resources on the recv side to place idata and copy
>>to event when operation is completed. Would this work for iWARP?
>>
>>Two different models for receiving idata should be avoided if
>>at all possible.
>>
>
>Always delivering by the event is not feasible for an iWARP vendor.
>If you are working over RDMAC verbs then the work request is no
>longer accessible by the time the Work Completion is reaped. So copying
>from the receive buffer to the event does not work since the location
>of the receive buffer is now known only to the application.
>
>The same problem exists in the opposite direction for InfiniBand HCAs
>using standard verbs. They cannot copy from the CQE to the receive
>buffer.
>
>So the user is stuck checking a flag or the event type to know where
>their data is. This is not terribly user-friendly, but it is the best
>that can be offered if we want to enable this optimization. The need
>to check the flag does reduce the value of the optimization though.
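
To make the "check a flag" cost concrete, here is a minimal sketch of what the consumer side might look like. Everything in it is an assumption for illustration: `Completion`, `idata_in_event` and `read_idata` are hypothetical names, not part of the DAT proposal, and network byte order for the in-buffer value is assumed.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class Completion:
    """Hypothetical completion event; not a DAT structure."""
    idata_in_event: bool        # the flag the consumer has to check
    event_idata: Optional[int]  # meaningful only when idata_in_event is True
    xfer_length: int            # bytes placed in the receive buffer


def read_idata(comp: Completion, recv_buf: bytes) -> int:
    """Return the 32-bit immediate value from wherever the provider put it."""
    if comp.idata_in_event:
        # IB-style delivery: the value arrived with the completion itself.
        return comp.event_idata
    # iWARP-style delivery: the value sits at the start of the receive buffer
    # (network byte order is an assumption of this sketch).
    (value,) = struct.unpack_from("!I", recv_buf, 0)
    return value
```

The branch is cheap, but every consumer has to carry it, which is the usability cost described above.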
>
>>
>>6. Does dto_completion_data xfer_length include the immediate_data
>>size or not?
>>
>>no
>>
>
>Then how does the receiver know how much data there is?
>
>Even if an iWARP Provider attempts to optimize immediate
>placement into the CQ, it will end up setting the xfer_length
>whenever the packet is received out of order.
>
>So it is far simpler for the application to simply know that
>the data will be in the buffer, and that the xfer_length will
>be set. It doesn't need to worry about whether they were set
>by the cq_poll verb or by the hardware.
>
>>
>>11. Need to clean up the operation description to make it clear
>>that the Send|RDMA_write and the immediate data part are a single
>>atomic operation. The current "followed by" language is misleading.
>>
>>Make it explicit that there is a single local DTO completion
>>and single remote DTO completion.
>>
>>Ok, I will clean that up
>>
>
>The best mapping available over RDMAC-compliant firmware for
>an iWARP NIC would be to post two operations (RDMA Write followed
>by a short Send). That would require additional space in the send
>and completion queues since a completion for the write can only
>be suppressed for a successful completion.
>
>Whether these extra slots were required would be an IA attribute.
>
>And the requirement is that nothing for that QP can come between
>the iWARP Write and the Send. How the provider does that is up
>to it. Options include locking over both posts and a composite
>work request. Anyone working over existing RDMAC-compliant
>verbs will have to use the first approach.
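
The "locking over both posts" option can be modelled roughly as below. This is a toy sketch under stated assumptions, not provider code: `SendQueue`, `post` and `post_write_with_idata` are hypothetical names, and a Python list stands in for one QP's send queue. The only point is that every post path takes the same lock, so nothing can land between the Write and the Send.

```python
import threading
from typing import List, Tuple


class SendQueue:
    """Toy stand-in for one QP's send queue; not a real verbs object."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.posted: List[Tuple[str, int]] = []  # (opcode, tag) in posting order

    def post(self, opcode: str, tag: int) -> None:
        """Post a single ordinary work request."""
        with self._lock:
            self.posted.append((opcode, tag))

    def post_write_with_idata(self, tag: int) -> None:
        """Emulate write-with-immediate as RDMA Write plus a short Send.

        The lock is held across both posts, so no other work request for
        this queue can be placed between them. The pair uses two queue slots.
        """
        with self._lock:
            self.posted.append(("RDMA_WRITE", tag))
            self.posted.append(("SEND_IDATA", tag))
```

A composite work request would give the same ordering guarantee without the lock, but as noted above that is not available over existing RDMAC-compliant verbs.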
>
>>12. Is your intention that post_recv_immed can ONLY accept
>>immediate data and is not capable of receiving any message?
>>
>>No, the intention is to extend the post_recv to handle 32-bit
>>idata which may arrive with or without other send or rdma_write data.
>>
>>Does it make more sense to add a dto_flags to the existing post_recv?
>>
>
>How does this map to iWARP?
>
>When the data can be sent as an immediate OR as data, then when received
>it can be placed into the receive buffer or even potentially directly
>into the CQ when everything aligns just right.
>
>But an iWARP sender has to place the immediate value as the first
>four bytes of a Send message. There is no other mapping that makes
>sense. Shoving the rest of the message up is complex, as is using
>the last four bytes of the message since the last four bytes *could*
>cross a DDP Segment boundary, and would require the user to provide
>a buffer that was 4 bytes larger.
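
For reference, the "first four bytes of a Send message" mapping amounts to something like the following. It is only a sketch: `build_send` and `split_recv` are hypothetical helpers, and network byte order for the immediate value is an assumption, since the thread does not specify one.

```python
import struct
from typing import Tuple

IDATA_LEN = 4  # a 32-bit immediate value


def build_send(idata: int, payload: bytes = b"") -> bytes:
    """Sender side: prefix the message with the immediate value."""
    return struct.pack("!I", idata) + payload


def split_recv(message: bytes) -> Tuple[int, bytes]:
    """Receiver side: peel the immediate value off the front of the message."""
    if len(message) < IDATA_LEN:
        raise ValueError("message too short to carry immediate data")
    (idata,) = struct.unpack_from("!I", message, 0)
    return idata, message[IDATA_LEN:]
```

Because the value is a fixed-size prefix, the receiver always finds it at offset zero, which avoids the segment-boundary problem of carrying it in the last four bytes.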