Apache OpenOffice (AOO) Bugzilla – Issue 65788
bus error in udkapi
Last modified: 2013-02-07 21:55:12 UTC
m165 builds OK, but I get this bus error when building m166 on GNU/Linux SPARC with gcc4.1.1:

============= Building project udkapi =============
/home/jim/m165/udkapi/com/sun/star/uno
mkout -- version: 1.6
/home/jim/m165/udkapi/com/sun/star/uno
dmake: Executing shell macro: +echo $(IDLPACKAGE) | $(SED) 's/\\/\//g'
idlc @/tmp/mkXMx5Jz
idlc: compile 'Exception.idl' ...
idlc: compile 'NamingService.idl' ...
idlc: compile 'RuntimeException.idl' ...
idlc: compile 'SecurityException.idl' ...
idlc: compile 'DeploymentException.idl' ...
idlc: compile 'TypeClass.idl' ...
Bus error
dmake: Error code 138, while making '../../../../unxlngs.pro/misc/urd_cssuno.don'
'---* tg_merge.mk *---'
ERROR: Error 65280 occurred while making /home/jim/m165/udkapi/com/sun/star/uno
jim@sun:~/m165/udkapi$

If the problem file is skipped, the same error occurs again later with other files. If I revert only module sal to m165, the problem is fixed. I suppose this relates to cws_src680_mhu12.
Created attachment 36728 [details] some debug info
Created attachment 36734 [details] output from typesconfig
A bus error on SPARC is a hint of memory alignment problems. Browsing the changes reveals alignment code in alloc_impl.h, and so this:

jim@sun:~/m165/sal$ grep -r SAL_TYPES_ALIGNMENT8 *
inc/sal/types.h: #define SAL_TYPES_ALIGNMENT8 1
rtl/source/alloc_impl.h:#if SAL_TYPES_ALIGNMENT8 > 1
rtl/source/alloc_impl.h:#define RTL_MEMORY_ALIGNMENT_8 SAL_TYPES_ALIGNMENT8
rtl/source/alloc_impl.h:#endif /* SAL_TYPES_ALIGNMENT8 */
unxlngs.pro/inc/sal/typesizes.h:#define SAL_TYPES_ALIGNMENT8 4

This last one looks odd. If I change 4 to 8, touch rtl/source/alloc_impl.h and build again, then udkapi builds without errors. So it looks like the function GetAlignment or Description_Ctor in sal/typesconfig/typesconfig.c does not return a correct value?
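To illustrate why an allocator alignment of 4 instead of 8 could matter here, the following is a minimal sketch of the usual power-of-two rounding technique. The helper names align_up and is_aligned are hypothetical and are not taken from alloc_impl.h; how sal actually uses RTL_MEMORY_ALIGNMENT_8 is not shown in this issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * `alignment` must be a power of two.
 * Hypothetical helper for illustration, not code from sal. */
static size_t align_up(size_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Return nonzero if the address `p` is a multiple of `alignment`.
 * `alignment` must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

With this rounding, a block size of 12 stays 12 under an alignment of 4 but becomes 16 under an alignment of 8. If blocks are laid out back to back, the 4-byte setting can therefore place the next block at an address that is not a multiple of 8, which would be consistent with the symptom described above.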
mhu: please comment
mhu->sparcmoz: According to your description, it looks like some 8 byte (or larger) type cannot cope with a 4 byte alignment (resulting in SIGBUS). As the 'typesconfig' program does test with 'double', which can be 4 byte aligned (same as on Solaris Sparc 32bit), it would be interesting to find out what type exactly is causing the SIGBUS (and possibly add such a test to the 'typesconfig' program). And yes, this SAL_TYPES_ALIGNMENT8 = 4 works on Solaris Sparc 32bit; only in 64bit mode we have a SAL_TYPES_ALIGNMENT8 = 8, here. So, can you please try to find out which type exactly is causing the SIGBUS here? Thanks, Matthias
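As a possible starting point for finding the offending type, here is a small sketch that reports the alignment the compiler assigns to a few candidate 8 byte (or larger) types. It uses the offsetof technique on a probe struct. This is an assumption about a useful check, not the method used by typesconfig, and the probe names and the list of types are hypothetical. Note that it reports what the compiler assumes, which may differ from what the hardware tolerates at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Declare a probe struct: one char followed by the type under test.
 * The offset of `value` is the alignment the compiler gives that type. */
#define DEFINE_PROBE(NAME, TYPE) struct probe_##NAME { char pad; TYPE value; }
#define ALIGNMENT_OF(NAME) offsetof(struct probe_##NAME, value)

/* Hypothetical selection of candidate types. */
DEFINE_PROBE(c, char);
DEFINE_PROBE(d, double);
DEFINE_PROBE(ll, long long);
DEFINE_PROBE(ld, long double);
DEFINE_PROBE(ptr, void *);

/* Print size and compiler-assigned alignment of each candidate type. */
static void report_alignments(FILE *out)
{
    /* Sanity check of the probe technique: char never needs padding. */
    assert(ALIGNMENT_OF(c) == 1);

    fprintf(out, "double      size %lu align %lu\n",
            (unsigned long)sizeof(double), (unsigned long)ALIGNMENT_OF(d));
    fprintf(out, "long long   size %lu align %lu\n",
            (unsigned long)sizeof(long long), (unsigned long)ALIGNMENT_OF(ll));
    fprintf(out, "long double size %lu align %lu\n",
            (unsigned long)sizeof(long double), (unsigned long)ALIGNMENT_OF(ld));
    fprintf(out, "void *      size %lu align %lu\n",
            (unsigned long)sizeof(void *), (unsigned long)ALIGNMENT_OF(ptr));
}
```

Calling report_alignments(stdout) on the affected machine and comparing any value above 4 with SAL_TYPES_ALIGNMENT8 might show which type the compiler expects to be 8 byte aligned.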
sparcmoz-->mhu: I tried deleting various types from com/sun/star/uno/TypeClass.idl but it is not that easy - there is some kind of interaction between the different types so that some combination of types is involved, and that makes a large number of trials, so I need a better strategy... I am attaching a patch, which I used to build and run m170, but this is NOT suggested as a fix, it is just filed here for ease of finding later...
Created attachment 36794 [details] to build m170 - proof of concept only - not a solution
sparcmoz->mhu: some more random bits of information.

In the case of com/sun/star/uno/TypeClass.idl, I can remove the bus error by compiling that file before the others in com/sun/star/uno, simply by changing the sequence in makefile.mk. Something similar can be seen in com/sun/star/lang, but when some files are moved to compile sooner, the bus error comes at a different file which previously did not have a bus error. I attach a diff that shows the files that built OK after being moved up in the compile sequence.

Within TypeClass.idl the bus error may be overcome by deleting, for example, all types after ARRAY, but I cannot narrow it down to any single type or group in that file. If each of the types within TypeClass.idl is placed alone in a file, then there is no bus error.

If I run idlc directly from the command line on any file that had a bus error, I get
idlc: returned successful

I have now completely built and run m170 with SAL_TYPES_ALIGNMENT8 = 8. As I am using gcc4.1.1, I have to use cws_src680_warnings01 for bridges.
Created attachment 36829 [details] files moved to avoid bus error
I did some tracing in idlc and have the following observations:

(a) The Bus Error occurs sometimes, but only when trying to execute this command in idlc/source/idlcproduce.cxx:
// produce registry file
if ( !idlc()->getRoot()->dump(rootKey) )
Further testing shows that idlc()->getRoot() is OK but an error occurs in dump(rootKey).
(b) dump(rootKey) is implemented in idlc/source/astdeclaration.cxx at row 172, in the function
sal_Bool AstDeclaration::dump(RegistryKey& rKey)
(c) This function is recursive, as it includes the following statement in a while loop:
bRet = pDecl->dump(rKey);
(d) Trying to understand what dump does, it appears the rKey has a variable number of members and the getRoot checks if the last member is included in a list of known types such as NT_module.
(e) In normal operation the function is re-entered 5 times, the first 4 times finding type NT_module and the 5th time finding a different type. After the 5th re-entry the while loop is completed and the dump function returns 5 times.
(f) In failure operation the function enters 5 times and identifies the 5th type but never returns, as the Bus Error occurs at that point.
(g) The first Bus Error is noted with type NT_enum, and if that is bypassed as described in earlier comments, then the error occurs next with NT_struct.

I will attach a log of running udkapi and the patches to idlc that print out that log.

From reading about recursion, it appears a useful test would be to implement dump with some kind of loop so it is not recursive, but I have no idea yet how to do that. I have not figured out how to run this with gdb yet; I can see that idlccpp is called by execv from sal.

This is very slow work for me, and as it works with alignment 8, I wonder if it is worth any more work at all?
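On the question of how a recursive dump could be written as a loop: one common approach is to keep an explicit stack of nodes still to be visited. The sketch below shows that technique in plain C on a hypothetical Node type. It is not the idlc AstDeclaration class, and the names Node, VisitFn, count_visit and dump_iterative are made up for illustration. It only demonstrates the loop form and does not address the alignment question itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a declaration node; not the idlc class. */
typedef struct Node {
    int type;                /* stand-in for a node type such as NT_module */
    struct Node **children;  /* child declarations, may be NULL if none */
    size_t child_count;
} Node;

/* Callback invoked once per node; returns nonzero on success. */
typedef int (*VisitFn)(const Node *node, void *ctx);

/* Example visitor: counts visited nodes in the size_t that ctx points to. */
static int count_visit(const Node *node, void *ctx)
{
    assert(node != NULL && ctx != NULL);
    ++*(size_t *)ctx;
    return 1;
}

/* Visit root and all descendants depth-first without recursion.
 * Returns 1 on success, 0 if a visit failed or memory ran out. */
static int dump_iterative(const Node *root, VisitFn visit, void *ctx)
{
    size_t cap = 16, top = 0, i;
    const Node **stack;
    int ok = 1;

    if (root == NULL)
        return 1;
    stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return 0;
    stack[top++] = root;

    while (top > 0) {
        const Node *node = stack[--top];
        if (!visit(node, ctx)) {
            ok = 0;
            break;
        }
        /* Push children in reverse so the first child is visited first. */
        for (i = node->child_count; i > 0; --i) {
            if (top == cap) {
                const Node **grown = realloc(stack, 2 * cap * sizeof *stack);
                if (grown == NULL) {
                    ok = 0;
                    goto done;
                }
                stack = grown;
                cap *= 2;
            }
            stack[top++] = node->children[i - 1];
        }
    }
done:
    free(stack);
    return ok;
}
```

For example, calling dump_iterative(&root, count_visit, &count) on a root with two leaf children leaves count at 3. If the Bus Error also appeared with a loop of this kind, that would suggest recursion depth is not the cause.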
Created attachment 37060 [details] Bus Error build log
Created attachment 37061 [details] patches for tracing dump and produce build.log
This error is hidden if module sal is rebuilt with ALLOC=SYS_ALLOC set in the environment, for example by configure --with-alloc=system. In that case it does not matter for udkapi whether ALLOC is set or not. So I guess the problem might be found in the sal code that is used when ALLOC is not set. I would like to try to build everything without the system allocator, but I am not sure if that should be done in this case.
todo: investigate typesconfig patch from IA64 porting issue 84999
first step is testing - possible fix already integrated from issue 86955.