Apache OpenOffice (AOO) Bugzilla – Issue 108371
cppu: preserve 128-bit stack alignment on x86 in cpp_uno/gcc3_linux_intel
Last modified: 2010-03-10 09:06:47 UTC
The gcc3 linux-intel-x86 ABI expects stack frames to be aligned to 128-bit (16-byte) boundaries. This is to allow optimised code to use SSE and SSE2 instructions with 128-bit operands, which segfault when used on (stack) memory operands violating 128-bit alignment. This is seen at http://bugs.gentoo.org/show_bug.cgi?id=292519 as a segfault during installation, when executing build/ooo310-m19/solver/310/unxlngi6.pro/bin/regcomp. The optimised code where the segfault occurs is in python internals, in PyString_Format. The issue is that the asm within cpp_uno/gcc3_linux_intel does not preserve 128-bit stack alignment. gcc3 assumes 128-bit alignment and emits code that preserves that alignment, but requires that asm also act to preserve alignment. I attach a patch to preserve alignment in cpp_uno/gcc3_linux_intel.
Created attachment 67229 [details] cpp_uno-align.diff
extremely interesting
@ecatmur: Do you happen to have a link to a relevant ABI specification? <http://ldn.linuxfoundation.org/lsb/lsb4-resource-page> seems to still only mandate the classical Sys V 32-bit alignment.
kr->sb: please take this over.
@sb: I think I'm remembering an assertion by Jakub Jelinek at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838#c1 > ------- Comment #1 From Jakub Jelinek 2009-07-23 12:56 [reply] ------- > Linux/ix86 ABI says that stack must be 16 byte aligned. So GCC can rely on it. It seems rather contentious as to what the Linux x86 ABI is or if there even is one (there is a reference to a SYSV ABI, which I've never seen). Looking at that bug, I'm not really sure what gcc is doing about this issue; I don't understand H. J. Lu's comments or patches. In any case, any fixes will be to gcc 4.5 or later and won't make a difference to current distros' compilers.
@ecatmur: Thanks for the pointer. Not sure if the apparent non-existence of a Linux x86 ABI should make me laugh or cry. Anyway, as it reportedly helps, we can integrate the patch. Just a few questions: From the placement of the esp adjustments, I gather that the requirement is that esp is 16 byte aligned just prior to a call (i.e., the return address always goes to an 0x...C address); right? Then, in call.s, privateSnippetExecutorVoid, doing 5*pushl = 20 byte, plus 4 byte return address, would need the esp adjustment, too, right? Also, in all the other adjustments in call.s, instead of doing something like subl $0x4,%esp andl $0xfffffff0,%esp one could instead exploit the knowledge that upon entry to the function esp = 0x...C and do subl $0x8,%esp instead (for an original subtract of $0x4, and a total of 5 pushl) to ensrure that afterwards exp = 0x...0. Right? (Which would save an instruction, and would waste less stack space on average for environments that only do 4 byte alignment).
I incorporated my <#desc7> thoughts into the patch and applied it as <http://hg.services.openoffice.org/cws/sb119/rev/35c4f2290244>. Let me know if my thinking is correct and this also works for you.
.