Apache OpenOffice (AOO) Bugzilla – Issue 47541
OpenOffice Interaction With GNOME Chooser on NLD Fails
Last modified: 2005-06-08 14:33:28 UTC
Working with Michael Meeks from Novell, we have been monitoring how the GNOME Chooser works from within OpenOffice on NLD (Novell Linux Desktop). He has requested we mark this one assigned to 'mmeeks', which I will attempt to do.

To bring this report up to date:
-- We were using the NLD final beta and thought that upgrading to the NLD release might fix this issue. Upgraded to the NLD release, and the problem remains.
-- Michael asked that we upgrade NLD to SP1, which we did today, and that didn't fix it.
-- Michael asked that we bring down one of the Novell milestone snaps that they have been building. The last available was M79, which I installed via red-carpet. This snap installed and ran just fine. GNOME chooser worked fine.

It should be noted that the GNOME Chooser in the regular Sun builds worked fine around that same timeframe. It has only gotten really bad in the last 2-3 milestones. I'm going to start attaching requested files after posting this report.

The two things that I am seeing immediately in M93 are:
-- Do a File-->Open, GNOME chooser opens, hit [Cancel], and File-->Exit no longer works.
-- Do a File-->Open, GNOME chooser opens, select a file and the chooser closes; OOo flashes once and no document loads.

There might be more issues; I can't get further to test them. It should be noted that the regular OOo file manager works just fine.
Created attachment 25106 [details] Results of rpm -q --changelog gtk2 | head as requested.
Created attachment 25108 [details] Shockwave file of File-->Open, Cancel, File-->Exit which does nothing (should exit program)
Created attachment 25109 [details] Shockwave file of File-->Open, select file, nothing loads into OpenOffice.
I tried to strace during the file manager. When I do strace ./swriter it doesn't display information while I'm doing File-->Open (new thread?). When I start up OpenOffice and then try and attach to the running thread, this is what I get:

oa1:/opt/openoffice.org1.9.93/program # ps -ef | grep open
root 4957 4883 0 13:09 pts/1 00:00:00 /bin/sh /opt/openoffice.org1.9.93/program/soffice -writer
root 4967 4957 18 13:09 pts/1 00:00:02 /opt/openoffice.org1.9.93/program/soffice.bin -writer
root 4976 4883 0 13:09 pts/1 00:00:00 grep open
oa1:/opt/openoffice.org1.9.93/program # strace -p 4967
strace: out of memory

Sorry, not an expert at the debugging tools. Let me know what else I can try, I would be happy to give it a try.
So - there is no easy way to debug this; and I can't reproduce it here. We're using a fairly patched build, particularly in this area - although, I'd imagine our fixes are not affecting this - unless it's some weird filter selection problem. I'm running on NLD/SP1 myself, naturally; Dave - my gtk+ is 2.4.13-2 - yours seems older; I wonder why that is & how that can be. Can you also specify your gnome-vfs2 version ? Caolan - any ideas ? - Ultimately this is working fine for both of us :-)
vaguely thrashing around there could be some whacked threading issue as well. Right now we're using glibc 2.3.5 which has disposed of linuxthreads, and all works well for us with that combo of OOo 1.9.92/glibc 2.3.5/gnome-vfs2 of 2.10.0 and gtk2 2.6.7
anything like accessibility enabled ?
oa1:~ # rpm -qa | grep vfs
gnome-vfs2-devel-2.6.1-6.19
gnome-vfs2-2.6.1-6.20
oa1:~ # rpm -qa | grep gtk | sort
gtk-1.2.10-881.2
gtk-engines-0.12-960.2
gtk-sharp-1.0.4-0.1
gtk2-2.4.9-0.4
gtk2-devel-2.4.9-0.4
gtk2-engines-2.2.0-395.3
gtk2-themes-0.1-634.2
gtkhtml2-3.2.4-0.1
gtksourceview-1.0.1-2.1
gtkspell-2.0.5-56.1
libgtkhtml-2.6.1-2.1
python-gtk-2.2.0-2.2
oa1:~ # rpm -qa | grep glibc
glibc-2.3.3-98.38
glibc-locale-2.3.3-98.41
glibc-devel-2.3.3-98.38
glibc-html-2.3.3-98.41
glibc-i18ndata-2.3.3-98.41
oa1:~ #

This is stock NLD installed cleanly with SP1 downloaded from red-carpet. Maybe at some point you brought down newer packages internally that aren't available to outside red-carpet users? This is a very clean install, just a few days old. Michael, I can get you logged into that server with Citrix Metaframe for Unix if that would help. We are trying to keep our two copies of NLD (one for Evolution, one for OpenOffice) 'clean' and supported and want to only download packages right from Novell. We agreed on NLD when working up the support contract for using OOo/Evolution/GroupWise. Let me know how I can help.
Accessibility is not enabled. I tried a few things: 1) Ran it from the console instead of remote display to thin clients, it's doing the same thing. 2) Changed theme to a few different selections to see if that mattered, it did not.
Dave - to confirm, this is not happening with the OO.o build you get in NLD though ? - which is also using the (same) gtk+ file selector. So - this really does sound like some up-stream snafu. I guess when bugs: i#46800# i#47163# get merged / into the mainline & we're building from the same tree, it should be easier to isolate the problem. The other thing is - does the ooo-645 package from the red-carpet machine work for you - the fpicker code in that is extremely similar to what shipped in m79; so ... Of course - there is one other bug that is the broken osl_Condition nonsense that up-stream will not fix until 2.0.1 - if you have a MT machine - it's entirely possible that you're being bitten by i#44627#; hard to say.
You might be onto something about multi-threading. The machine is a quad-3GHz server. I disabled hyper-threading (which makes the machine obviously look like an 8-way to Linux) and it still failed. This time, however, I did get it to physically crash and report some information. Turning off hyper-threading was the first time I got output on the command line in this manner. I can't say for sure 100% this is the difference, but just reporting what I'm seeing. I'll attach the text that appeared on the command line when I attempted to access the GNOME file chooser. Yes, that older M79 build worked ok. I could select files and saw no crashes. Also, older builds from Sun worked OK as I mentioned above. They changed something in the last 2-3 milestones that caused this failure. If they are holding fixes for NLD until 2.0.1, we probably will just revert OOo back to the regular file manager, which seems to work much better in these later milestones. Attaching output that went to the command line.
Created attachment 25182 [details] Command line output when clicking on GNOME file chooser.
> You might be onto something about multi-threading. The machine is a quad-3Ghz
> server. I disabled hyper-threading (which makes the machine obviously look
> like an 8way to Linux) and it still failed. This time however, I did get it
> to physically crash and report some information.

Can you attach gdb to OO.o and get a more detailed stack trace ? that'd perhaps be helpful.

> If they are holding fixes for NLD until 2.0.1, we probably will just revert
> OOo back to the regular file manager which seems to work much better in
> these later milestones.

Up-stream is just not committing a fix to osl_Condition [ which is that it can't be waited on with a timeout reliably ;-], it shouldn't cause a crasher, although I can well believe it causes your problem. Of course, that fix was in our m79 builds, and will be in our 2.0.0 final builds. They perceive the fix as risky / cf. the bug.

Anyhow - if you want a build that works well on NLD, it's prolly best grabbing the latest (source) release of ooo-build, and giving it a go. Failing that we'll have packages of an m92 release on red-carpet.go-oo.org in the next few days DV.
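For readers unfamiliar with the term, the following is a minimal sketch of what "waiting on a condition with a timeout" is expected to do. It is not the OOo osl_Condition code; `ManualResetCondition` and `signalled_within` are hypothetical names built on the C++ standard library, and the predicate form of `wait_for` is used so that spurious wakeups and early timeouts do not produce a wrong answer.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical manual-reset condition with set/reset/timed-wait.
// Illustration only; this is NOT the osl_Condition implementation.
class ManualResetCondition {
public:
    void set() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_signalled = true;
        }
        m_cv.notify_all();
    }

    void reset() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_signalled = false;
    }

    // Returns true if the condition was set before the timeout expired.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        // The predicate is re-checked under the lock on every wakeup.
        return m_cv.wait_for(lock, timeout, [this] { return m_signalled; });
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_signalled = false;
};

// Hypothetical helper: another thread sets the condition after `delay`,
// while the caller waits for at most `timeout`.
bool signalled_within(std::chrono::milliseconds delay,
                      std::chrono::milliseconds timeout) {
    ManualResetCondition cond;
    std::thread setter([&cond, delay] {
        std::this_thread::sleep_for(delay);
        cond.set();
    });
    bool ok = cond.wait_for(timeout);
    setter.join();
    return ok;
}
```

A reliable timed wait reports success when the setter fires inside the timeout and reports a timeout otherwise; an unreliable one can leave the waiter stuck or wake it with the wrong answer, which is the kind of behaviour the comment above attributes to i#44627#.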
Could you do this:

gconftool-2 --set /desktop/gnome/interface/file_chooser_backend --type=string gtk+

and see if the problem goes away? If it does, then it means something is wrong in the VFS backend or in gnome-vfs itself. To restore things to their original state, use this:

gconftool-2 -u /desktop/gnome/interface/file_chooser_backend
Did as you requested, using the gconf command line. I issued the command, and OOo failed to work in the same manner. I saw no difference at all in the way it's working. If it's a VFS problem, it must be related to VFS on a MP server because you aren't seeing it. [Server is an HP, Quad-3.00Ghz, 8GB physical memory, 20GB swap space]. Maybe the NLD bigmem, MP kernel interacting with VFS which probably was not tested heavily during the beta period. I have been able to figure out that the command line spewage happens when you have the native OOo file manager selected, toggle back to using the GNOME file chooser and then click on File-->Open. It crashes the first time completely, and then each time after that it works as it did in the shockwave shots I did above. No crash, just does nothing. I'll try and get gdb on it during this type of crash.
Stupid question: did you restart OOo after doing "gconftool-2 --set ..."? And to double-check: before using that command, the File/Open dialog should display a list of available volumes in the left-hand pane (floppy, CDROM, etc.). After using that command, it should list no volumes other than "Filesystem". If the second thing happens properly (showing no volumes), then I have no idea of what could be causing the bug.
The conditions of my test were that OpenOffice was completely closed. I issued the command and started OOo and then checked the results. I'm not seeing any difference in the file chooser with and without the command issued. No "volumes" ever appear in the chooser. That possibly could be because I'm not logging into the console itself. I'm telnetting to the server, setting my DISPLAY back to my thin client and then using remote display. I can get anyone interested in seeing it first hand logged in via Citrix Metaframe. M95 just hit the mirrors, testing that now.
M95 doesn't allow use of the GNOME file chooser at all, couldn't test. Unchecking the option to use the Openoffice.org dialog doesn't do anything. The default OOo file manager comes up either way. Possibly might have to submit this as another issue? gnome-integration package was installed correctly in M95. I'm going back to M93 and see if I can get gdb to report additional information.
Michael and Novell pushed their own build of M92 out, and they requested that I bring it down and test on our server. [[ NLD, all patches installed ]]. I clicked on the "Open" icon and it took a LONG time to open (felt like it was crashing), when I clicked on the Cancel button it crashed and bug buddy opened and I'm attaching the debugger information. So it seems like whatever is happening, is almost exactly the same in the Novell builds. It's got to be related to big memory or multiple CPUs because that is the only thing that is different than the workstations where it works correctly. Hyperthreading is out, I've turned that off for our testing. Attaching gdb (bug buddy) information.
Created attachment 25750 [details] Bug Buddy produced output after attempts to use the GTK file chooser.
From the stack trace it looks like you have an old libgnomeui. These are the SRPMS for the packages that will be in NLD SP2: http://primates.ximian.com/~federico/misc/gtk+/SRPMS/ I wonder if you could give them a try. In particular, they fix a bunch of threading problems.
Is there a way to get these built in a manner so I can just rpm the binary into NLD? A possible pre-release of SP2? I don't feel comfortable trying to build from source and trying to integrate those packages into that server. We are trying to keep that server "stock" as much as possible so that we can open support calls if required with Novell. If not, we can wait for SP2 I guess.
OK, I've uploaded built packages here: http://primates.ximian.com/~federico/misc/gtk+/
Ok, got the rpms (thanks!) and I'm going to schedule a reboot of that server, re-enable hyperthreads and install updated packages so it can be tested correctly.
Good. Please make sure you use gconftool-2 -u /desktop/gnome/interface/file_chooser_backend as described above to restore the GConf key to its original state. This will let you test the VFS backend, which is what we are interested in.
Ok, those rpm patches went right on top of SP1 just fine and installed:

oa1:/users/drichard/federico # rpm -qa | grep vfs
gnome-vfs2-devel-2.6.1-6.26
gnome-vfs2-2.6.1-6.26
oa1:/users/drichard/federico # rpm -qa | grep gtk2
scim-gtk2-immodule-1.0.1-1.7
gtk2-2.4.14-0.5
gtk2-engines-2.2.0-395.3
gtk2-devel-2.4.14-0.5
gtk2-themes-0.1-634.2
oa1:/users/drichard/federico # rpm -qa | grep gnomeui
libgnomeui-2.6.2-0.1
libgnomeui-devel-2.6.2-0.1
oa1:/users/drichard/federico #

I then ran M100 and enabled the GNOME file chooser and it's working in exactly the same manner. Here is what it's doing:
- It navigates correctly through the file systems, including NFS drives. I then select to Open and the GUI closes and the OOo panel never gets the file. It's empty and still "Untitled".
- If I open OOo and then type in some text and hit the Save button and type in a name and attempt to save it, the GUI closes and nothing happens. The title bar doesn't change to reflect the file name.
- If I hit File->Open and then [Cancel], the OOo main GUI no longer accepts File->Close to exit the program.

Sorry :\
So - I'm guessing this is a combination of threading woes & a (now fixed) libgnomeui threading problem that Federico has fixed. I got Dave to upgrade to our m100 builds, and built a debug-enabled library for him. Unfortunately the non-debugging m100 of ours works, where (apparently) the m92 one didn't [ no changes in ooo-build in this area between them, and nothing relevant up-stream ]. However the up-stream / Sun m100 doesn't work.

So - this has to be some deep race foo, perhaps helped in our build by our fix for i#44627#; without a working 'condition' life is rather difficult. Looking for such a thing reveals that asynceventnotifier.cxx is full of threading / condition foo and that 'm_bRun' is not correctly guarded by a mutex. The comment in the startup method there scares me too:

    // m_bRun may already be false because of a
    // call to stop but the thread did not yet
    // terminate so m_hEventNotifierThread is
    // yet a valid thread handle that should
    // not be overwritten
    if (!m_bRun)

Are we really expecting to re-use this thread multiple times & start/shutdown it in a potentially interleaved fashion? I hope not; the code doesn't look like it can cope with that.

Anyhow - it seems there is an obvious race between the m_aNotifyCondition.wait() and the m_aNotifyCondition.reset() that is unnecessary. This patch addresses that, and fiddles with the m_aExitCondition.reset() in a rather futile way to remove the sense of another race - it should really be guarded by a rather tedious lock ;-)

Finally - I forget quite why we needed the async thread to do this; I thought we wanted to post the events to be emitted from the Application:: main loop, but perhaps I'm mistaken. Caolan ?
--- fpicker/source/unx/gnome/asynceventnotifier.cxx 18 Jan 2005 13:25:45 -0000 1.3
+++ fpicker/source/unx/gnome/asynceventnotifier.cxx 6 May 2005 14:15:27 -0000
@@ -141,9 +141,9 @@ void SAL_CALL SalGtkAsyncEventNotifier::
     osl::ResettableMutexGuard aGuard(m_aMutex);
     OSL_PRECOND(m_bRun,"Event notifier does not run!");
-    m_bRun = false;
     m_aExitCondition.reset();
+    m_bRun = false;
     m_aNotifyCondition.set();
@@ -210,6 +211,7 @@ void SAL_CALL SalGtkAsyncEventNotifier::
     while (m_bRun)
     {
         m_aNotifyCondition.wait();
+        m_aNotifyCondition.reset();
         while (getEventListSize() > 0)
         {
@@ -238,7 +240,6 @@ void SAL_CALL SalGtkAsyncEventNotifier::
                 }
             }
         } // while(getEventListSize() > 0)
-        m_aNotifyCondition.reset();
     } // while(m_bRun)
     m_aExitCondition.set();
 }
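The wait()/reset() race described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: `Notifier`, `post` and `iterate_with_late_post` are hypothetical names, a plain `bool` stands in for m_aNotifyCondition, and the "producer" is simulated by posting an event at the unlucky moment just after the consumer has found the queue empty. It does not reproduce the real fpicker code or its locking.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for the notifier state; not the fpicker class.
struct Notifier {
    bool signalled = false;   // models m_aNotifyCondition (manual reset)
    std::deque<int> events;   // models the pending event list

    // Producer side: enqueue the event, then set() the condition.
    void post(int ev) {
        events.push_back(ev);
        signalled = true;
    }
};

// Models one pass of the consumer loop after wait() has returned.
// A "late" event is posted right after the drain loop sees an empty queue.
// Returns whether the condition is still signalled afterwards, i.e.
// whether the next wait() would wake up and handle the late event.
bool iterate_with_late_post(Notifier& n, bool reset_right_after_wait,
                            int late_event) {
    if (reset_right_after_wait) {
        n.signalled = false;          // patched order: reset before draining
    }
    while (!n.events.empty()) {
        n.events.pop_front();         // "dispatch" the queued events
    }
    n.post(late_event);               // producer sneaks in here
    if (!reset_right_after_wait) {
        n.signalled = false;          // original order: wipes the late set()
    }
    return n.signalled;
}
```

In this model the original ordering leaves one event in the queue with the condition cleared, so the consumer would sleep until some later, unrelated set(); with the reset moved directly after wait() the late set() survives and the event is picked up on the next pass.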
FYI- Tested Sun M104 and still fails, cannot use GTK File Chooser on multi-CPU NLD. Will have to wait for those patches to make their way into 2.0.1
drichard: Can you try the m106 openoffice.org style rpms at http://ooomisc.services.openoffice.org/pub/OpenOffice.org/cws/upload/fpicker4/ and http://ooomisc.services.openoffice.org/pub/OpenOffice.org/cws/upload/fpicker4.alternative/ and let me know how you get on with them. mmeeks: I don't think we need any of the complex threading foo, and I've ditched it in the above two installsets and rejigged things a little. fpicker4 is with #i44627# patch, fpicker4.alternative is without it. What I'd like to see as the result is for fpicker4 to work, and fpicker4.alternative to show the problematic behaviour.
fpicker4 looks great in this regard - holding the solar mutex seems far more sensible :-)
Michael pointed me to this updated issue. I didn't get the email or I would have done it immediately for you, sorry. I brought down both builds, and installed on that multi-processor system with bigmem kernel. fpicker4 works perfectly. It 'feels' like the GNOME file picker is part of the gui now. In previous versions it would always kind of be sluggish in interaction with OOo and felt like it was launching an external application. And as was reported, files wouldn't always work, and menu options would disable and stop working after using the picker. Dialogs snap open, and work much faster with fpicker4. I installed fpicker4.alternative and tried it as well. It worked like the older milestones: Sluggish, felt like it was going to crash, disabled pulldowns, sometimes would load files, sometimes would not load them. From our perspective this is the best GTK picker milestone yet, and if the code doesn't mess up others, would love for it to be merged into the build. I'm leaving the bug open per your discretion. fpicker4 to me would close this issue.
fpicker4 has been moved to the test server and about 20 people will be using it full time next week. I'll report any problems.
Grand. The only difference between the two installsets is #i44627#. So closing as a duplicate of that. The broken "alternative" fpicker4 is without the 44627 fix. The separate code re-works and other cleanups are done in fpicker4 under the issues registered for that workspace. *** This issue has been marked as a duplicate of 44627 ***