Following some discussions [1] on the list regarding the apr_atomic implementation, this patchset tries to address some of the problems by reorganizing the various backends. Almost all points raised in [1] are addressed, in special, the unsafe mixing of mutex and asm (direct) backed atomics. For example, the apr_atomic_casptr function which was only mutex backed and is used by the worker and event mpms in httpd. The patchs still needs more wide testing, but any comments or feedback would be greatly appreciated. 1. http://mail-archives.apache.org/mod_mbox/apr- dev/200610.mbox/%3C20061020153547.GA21975@p15097255.pureserver.info%3E
Created attachment 20437 [details] generic mutex backed implementation of apr_atomic operations The generic implementation is used when no specialized implementation fits the defined criterias, or when the user forces the choice.
Created attachment 20438 [details] compiler provided atomic builtins Given a modern compiler, this patch provides fast atomic operations on various platforms (alpha, ia32, ia64, powerpc, etc). Tested on: 2x Pentium D, Ubuntu 7.04, gcc 4.1.2 16x Itanium II, RHEL 5, gcc 4.1.1
Created attachment 20439 [details] ia32 atomics ia32 (x86 and x86_64) atomic operations, plus apr_atomic_casptr
Created attachment 20440 [details] solaris atomic operations Tested on a Solaris 10 zone, i386.
Created attachment 20441 [details] powerpc atomic operations Tested on a PowerPC G4 (macbook).
Just a quick note that I'll address the issues with volatile, memory barriers and such, once the initial patches have been committed.
Patches committed to trunk, revisions: 553289, 553290, 553291, 553292 and 553293. http://svn.apache.org/viewvc?view=rev&revision=553289 http://svn.apache.org/viewvc?view=rev&revision=553290 http://svn.apache.org/viewvc?view=rev&revision=553291 http://svn.apache.org/viewvc?view=rev&revision=553292 http://svn.apache.org/viewvc?view=rev&revision=553293
Created attachment 20456 [details] remove volatile from apr_atomic_* prototypes This patch get rid of the "volatile" qualifier in the apr_atomic functions prototypes. As APR claims to provide proper memory barriers, the "volatile" qualifier is unnecessary, since the use of memory barriers already implies ordering constraints on memory operations. Also a bogus const qualifier for the third argument of apr_atomic_casptr is removed.
Created attachment 20457 [details] mutex implementation volatile qualifier removal Remove the volatile qualifiers for the mutex backed implementation of apr_atomic. The mutex primitives act as memory barriers, synchronizing memory access with respect to other threads. The removal of the volatile qualifiers also speeds up, a little bit, non-threaded builds of APR.
Created attachment 20458 [details] builtins implementation volatile qualifier removal Remove the volatile qualifiers for the atomic builtins backed implementation of apr_atomic. These builtins, except __sync_lock_test_and_set, are considered a full barrier. Also use a full memory barrier on the read32/set32 functions to force strong memory ordering.
Created attachment 20459 [details] ia32 implementation volatile qualifier removal Remove volatile qualifiers and issue memory barriers where appropriate to prevent reordering.
Created attachment 20460 [details] ppc implementation volatile qualifier removal Remove volatile qualifiers and issue memory barriers where appropriate to prevent reordering.
Created attachment 20461 [details] solaris implementation volatile qualifier removal Remove volatile qualifiers and issue memory barriers where appropriate to prevent reordering. The solaris atomic man page says: "The membar_consumer() function arranges for all loads issued before this point in the code to be completed before any subsequent loads." and "The membar_producer() function arranges for all stores issued before this point in the code to reach global visibility before any stores that follow." TODO: the other function may still need explicit memory barriers.
Created attachment 20462 [details] win32 implementation volatile qualifier removal Remove volatile qualifiers. The WIN32 Interlocked* functions by default provide full memory barrier semantics.
Created attachment 20463 [details] netware implementation volatile qualifier removal Remove volatile qualifiers, and regarding to memory barriers, the Novell developer kit documentation "NDK: Libraries for C (LibC), Volume 2; Section I, General C Services; Atomic Functions", says: "An atomic function is guaranteed to complete its memory operation before any other processor or bus agent can access the same memory." So I presume that the atomic functions issues memory barriers.
Created attachment 20464 [details] os390 implementation volatile qualifier removal
Created attachment 20465 [details] remove volatile usage on the atomic test code
Created attachment 20466 [details] apr_atomic_xchgptr implementation As requested by Curt Arnold, for the log4cxx project, implement apr_atomic_xchgptr that atomically exchanges a pair of pointer values.
Created attachment 20468 [details] apr_atomic_xchgptr implementation (+ppc) I forgot to refresh the patch. Only the OS/390 stub is intentionally missing.
This stuff should really be on dev@apr now. Is removing the volatile qualifier a backwards-compat break? (see http://apr.apache.org/versioning.html)
(In reply to comment #20) > This stuff should really be on dev@apr now. Just finishing a few things up before posting (z/Architecture). > Is removing the volatile qualifier a backwards-compat break? (see > http://apr.apache.org/versioning.html) No. It's a ABI break, the patches are intended only for trunk (1.3.0+).
As pointed to me by Paul Querna, an ABI break between 1.2.x and 1.3.x is not permitted and since I can't guarantee that it won't break the ABI for th various supported compilers, I'll only move forward the barrier related patches. Volatile removal is left for later (2.+).