[ofiwg] help analyzing ring buffer implementation

Thu Sep 21 16:10:51 PDT 2017

On Thu, Sep 21, 2017 at 10:54:21PM +0000, Hefty, Sean wrote:

> For reference, the ring buffer is written to allow simultaneous
> access reading and writing to the ring buffer.  Serialization of
> readers and writers is left up to the caller.

reading from memory being written to by another CPU without locking
must go through stdatomic.

The problem with the code as written is this:

        pthread_spin_lock(&my_ringbuf_ptr->wlock);
        ofi_rbwrite(&my_ringbuf_ptr->ringbuf, &data,
		sizeof(int));
        ofi_rbcommit(&my_ringbuf_ptr->ringbuf);
        pthread_spin_unlock(&my_ringbuf_ptr->wlock);

To work properly the store to rb->wcnt must be observed after the
writes in rbwrite.

The spin lock does not guarentee this, it only guarentees that another
thread that also grabs the lock will see the above happen atomically.

In other words on the read side, we have:

//        pthread_spin_lock(&my_ringbuf_ptr->wlock);
        empty = ofi_rbempty(&my_ringbuf_ptr->ringbuf);
//        pthread_spin_unlock(&my_ringbuf_ptr->wlock);

And, as written, it is allowed for this CPU to observe rb->wcnt being
updated before ofi_rbwrite has happened, and this would be a
malfunction.

You must either wrap all reads and writes in a lock to gain
serialization, or you must use stdatomic for the store and load sides
of the cnt values with an appropriate strength ordering mode.

I'm not sure what you 'fastlock' is these days, but it should also be
expressed in terms of stdatomic, see how I reworked the similar code
in rdma-core.

Jason