Bug 42806

Summary: apr_atomic problems
Product: APR Reporter: Davi Arnaut <davi>
Component: APRAssignee: Apache Portable Runtime bugs mailinglist <bugs>
Status: RESOLVED LATER    
Severity: normal    
Priority: P2    
Version: HEAD   
Target Milestone: ---   
Hardware: Other   
OS: other   
Attachments: generic mutex backed implementation of apr_atomic operations
compiler provided atomic builtins
ia32 atomics
solaris atomic operations
powerpc atomic operations
remove volatile from apr_atomic_* prototypes
mutex implementation volatile qualifier removal
builtins implementation volatile qualifier removal
ia32 implementation volatile qualifier removal
ppc implementation volatile qualifier removal
solaris implementation volatile qualifier removal
win32 implementation volatile qualifier removal
netware implementation volatile qualifier removal
os390 implementation volatile qualifier removal
remove volatile usage on the atomic test code
apr_atomic_xchgptr implementation
apr_atomic_xchgptr implementation (+ppc)

Description Davi Arnaut 2007-07-03 18:11:44 UTC
Following some discussions [1] on the list regarding the apr_atomic
implementation, this patchset tries to address some of the problems
by reorganizing the various backends.

Almost all points raised in [1] are addressed, in special, the unsafe
mixing of mutex and asm (direct) backed atomics. For example, the
apr_atomic_casptr function which was only mutex backed and is used
by the worker and event mpms in httpd.

The patchs still needs more wide testing, but any comments or feedback
would be greatly appreciated.

1. http://mail-archives.apache.org/mod_mbox/apr-
dev/200610.mbox/%3C20061020153547.GA21975@p15097255.pureserver.info%3E
Comment 1 Davi Arnaut 2007-07-03 18:17:58 UTC
Created attachment 20437 [details]
generic mutex backed implementation of apr_atomic operations

The generic implementation is used when no specialized implementation
fits the defined criterias, or when the user forces the choice.
Comment 2 Davi Arnaut 2007-07-03 18:23:06 UTC
Created attachment 20438 [details]
compiler provided atomic builtins

Given a modern compiler, this patch provides fast atomic operations on various
platforms (alpha, ia32, ia64, powerpc, etc).

Tested on:

2x Pentium D, Ubuntu 7.04, gcc 4.1.2
16x Itanium II, RHEL 5, gcc 4.1.1
Comment 3 Davi Arnaut 2007-07-03 18:24:40 UTC
Created attachment 20439 [details]
ia32 atomics

ia32 (x86 and x86_64) atomic operations, plus apr_atomic_casptr
Comment 4 Davi Arnaut 2007-07-03 18:25:25 UTC
Created attachment 20440 [details]
solaris atomic operations

Tested on a Solaris 10 zone, i386.
Comment 5 Davi Arnaut 2007-07-03 18:26:02 UTC
Created attachment 20441 [details]
powerpc atomic operations

Tested on a PowerPC G4 (macbook).
Comment 6 Davi Arnaut 2007-07-03 20:10:11 UTC
Just a quick note that I'll address the issues with volatile, memory barriers
and such, once the initial patches have been committed.
Comment 8 Davi Arnaut 2007-07-05 12:13:13 UTC
Created attachment 20456 [details]
remove volatile from apr_atomic_* prototypes

This patch get rid of the "volatile" qualifier in the apr_atomic functions
prototypes. As APR claims to provide proper memory barriers, the "volatile"
qualifier is unnecessary, since the use of memory barriers already implies
ordering constraints on memory operations.

Also a bogus const qualifier for the third argument of apr_atomic_casptr is
removed.
Comment 9 Davi Arnaut 2007-07-05 12:14:38 UTC
Created attachment 20457 [details]
mutex implementation volatile qualifier removal

Remove the volatile qualifiers for the mutex backed implementation of
apr_atomic. The mutex primitives act as memory barriers, synchronizing
memory access with respect to other threads. The removal of the volatile
qualifiers also speeds up, a little bit, non-threaded builds of APR.
Comment 10 Davi Arnaut 2007-07-05 12:15:29 UTC
Created attachment 20458 [details]
builtins implementation volatile qualifier removal

Remove the volatile qualifiers for the atomic builtins backed implementation
of apr_atomic. These builtins, except __sync_lock_test_and_set, are considered
a full barrier. Also use a full memory barrier on the read32/set32 functions
to force strong memory ordering.
Comment 11 Davi Arnaut 2007-07-05 12:16:19 UTC
Created attachment 20459 [details]
ia32 implementation volatile qualifier removal

Remove volatile qualifiers and issue memory barriers where appropriate to
prevent reordering.
Comment 12 Davi Arnaut 2007-07-05 12:17:00 UTC
Created attachment 20460 [details]
ppc implementation volatile qualifier removal

Remove volatile qualifiers and issue memory barriers where appropriate to
prevent reordering.
Comment 13 Davi Arnaut 2007-07-05 12:20:08 UTC
Created attachment 20461 [details]
solaris implementation volatile qualifier removal

Remove volatile qualifiers and issue memory barriers where appropriate to
prevent reordering. The solaris atomic man page says:

"The membar_consumer() function arranges for all loads issued before this point

in the code to be completed before any subsequent loads."

and

"The membar_producer() function arranges for all stores issued before this
point
in the code to reach global visibility before any stores that follow."

TODO: the other function may still need explicit memory barriers.
Comment 14 Davi Arnaut 2007-07-05 12:20:49 UTC
Created attachment 20462 [details]
win32 implementation volatile qualifier removal

Remove volatile qualifiers. The WIN32 Interlocked* functions by default
provide full memory barrier semantics.
Comment 15 Davi Arnaut 2007-07-05 12:22:32 UTC
Created attachment 20463 [details]
netware implementation volatile qualifier removal

Remove volatile qualifiers, and regarding to memory barriers, the Novell
developer kit documentation "NDK: Libraries for C (LibC), Volume 2; Section I,
General C Services; Atomic Functions", says:

"An atomic function is guaranteed to complete its memory operation before any
other processor or bus agent can access the same memory."

So I presume that the atomic functions issues memory barriers.
Comment 16 Davi Arnaut 2007-07-05 12:23:24 UTC
Created attachment 20464 [details]
os390 implementation volatile qualifier removal
Comment 17 Davi Arnaut 2007-07-05 12:24:08 UTC
Created attachment 20465 [details]
remove volatile usage on the atomic test code
Comment 18 Davi Arnaut 2007-07-05 12:26:23 UTC
Created attachment 20466 [details]
apr_atomic_xchgptr implementation

As requested by Curt Arnold, for the log4cxx project, implement
apr_atomic_xchgptr
that atomically exchanges a pair of pointer values.
Comment 19 Davi Arnaut 2007-07-05 14:22:09 UTC
Created attachment 20468 [details]
apr_atomic_xchgptr implementation (+ppc)

I forgot to refresh the patch. Only the OS/390 stub is intentionally missing.
Comment 20 Joe Orton 2007-07-06 00:17:51 UTC
This stuff should really be on dev@apr now.

Is removing the volatile qualifier a backwards-compat break? (see
http://apr.apache.org/versioning.html)
Comment 21 Davi Arnaut 2007-07-06 07:58:04 UTC
(In reply to comment #20)
> This stuff should really be on dev@apr now.

Just finishing a few things up before posting (z/Architecture).

> Is removing the volatile qualifier a backwards-compat break? (see
> http://apr.apache.org/versioning.html)

No. It's a ABI break, the patches are intended only for trunk (1.3.0+).
Comment 22 Davi Arnaut 2007-07-06 09:00:59 UTC
As pointed to me by Paul Querna, an ABI break between 1.2.x and 1.3.x
is not permitted and since I can't guarantee that it won't break the ABI
for th various supported compilers, I'll only move forward the barrier
related patches. Volatile removal is left for later (2.+).