Issue 47541 - OpenOffice Interaction With GNOME Chooser on NLD Fails
Summary: OpenOffice Interaction With GNOME Chooser on NLD Fails
Status: CLOSED DUPLICATE of issue 44627
Alias: None
Product: General
Classification: Code
Component: code
Version: 680m93
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: caolanm
QA Contact: issues@framework
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-04-15 17:40 UTC by drichard
Modified: 2005-06-08 14:33 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Results of rpm -q --changelog gtk2 | head as requested. (270 bytes, text/plain)
2005-04-15 17:43 UTC, drichard
Shockwave file of File-->Open, Cancel, File-->Exit which does nothing (should exit program) (254.50 KB, application/x-shockwave-flash)
2005-04-15 17:56 UTC, drichard
Shockwave file of File-->Open, select file, nothing loads into OpenOffice. (284.33 KB, application/x-shockwave-flash)
2005-04-15 18:01 UTC, drichard
Command line output when clicking on GNOME file chooser. (3.37 KB, text/plain)
2005-04-18 19:15 UTC, drichard
Bug Buddy produced output after attempts to use the GTK file chooser. (46.39 KB, text/plain)
2005-05-03 15:05 UTC, drichard

Description drichard 2005-04-15 17:40:07 UTC
Working with Michael Meeks from Novell, we have been monitoring how the GNOME
Chooser works from within OpenOffice on NLD (Novell Linux Desktop).  He has
requested we mark this one assigned to 'mmeeks' which I will attempt to do.

To bring this report to date:

-- We were using NLD Final beta and thought possibly that upgrade to NLD release
might fix this issue.  Upgraded to NLD release, and problem remains.

-- Michael asked that we upgrade the NLD to SP1, which we did today, and that
didn't fix it.

-- Michael asked that we bring down one of the Novell milestone snaps that they
have been building.  Last available was M79 which I installed via red-carpet. 
This snap installed and ran just fine. GNOME chooser worked fine.  It should be
noted that the GNOME Chooser in the regular Sun builds worked fine around that
same timeframe.  It has only gotten really bad in the last 2-3 milestones.

I'm going to start attaching requested files after posting this report.  The two
things that I am seeing immediately in M93 are:

Do a File-->Open, GNOME chooser opens, hit [Cancel] and File-->Exit no longer works.

Do a File-->Open, GNOME chooser opens, select a file and chooser closes, OOo
flashes once and no document loads.

There might be more issues, I can't get further to test them.  It should be
noted that the regular OOo file manager works just fine.
Comment 1 drichard 2005-04-15 17:43:25 UTC
Created attachment 25106
Results of rpm -q --changelog gtk2 | head as requested.
Comment 2 drichard 2005-04-15 17:56:17 UTC
Created attachment 25108
Shockwave file of File-->Open, Cancel, File-->Exit which does nothing (should exit program)
Comment 3 drichard 2005-04-15 18:01:07 UTC
Created attachment 25109
Shockwave file of File-->Open, select file, nothing loads into OpenOffice.
Comment 4 drichard 2005-04-15 18:12:28 UTC
I tried to strace during the file manager.  When I do

strace ./swriter it doesn't display information while I'm doing File-->Open (new
thread)?

When I start up OpenOffice and then try and attach to the running thread, this
is what I get:

oa1:/opt/openoffice.org1.9.93/program # ps -ef | grep open
root      4957  4883  0 13:09 pts/1    00:00:00 /bin/sh
/opt/openoffice.org1.9.93/program/soffice -writer
root      4967  4957 18 13:09 pts/1    00:00:02
/opt/openoffice.org1.9.93/program/soffice.bin -writer
root      4976  4883  0 13:09 pts/1    00:00:00 grep open
oa1:/opt/openoffice.org1.9.93/program # strace -p 4967
strace: out of memory

Sorry, not an expert at the debugging tools.  Let me know what else I can try, I
would be happy to give it a try.
Comment 5 mmeeks 2005-04-18 13:47:57 UTC
So - there is no easy way to debug this; and I can't reproduce it here.
We're using a fairly patched build, particularly in this area - although, I'd
imagine our fixes are not affecting this - unless it's some weird filter
selection problem.

I'm running on NLD/SP1 myself, naturally;

Dave - my gtk+ is 2.4.13-2 - yours seems older; I wonder why that is & how that
can be. Can you also specify your gnome-vfs2 version ?

Caolan - any ideas ? - Ultimately this is working fine for both of us :-)
Comment 6 caolanm 2005-04-18 13:52:28 UTC
vaguely thrashing around there could be some whacked threading issue as well.
Right now we're using glibc 2.3.5 which has disposed of linuxthreads, and all
works well for us with that combo of OOo 1.9.92/glibc 2.3.5/gnome-vfs2 of 2.10.0
and gtk2 2.6.7
Comment 7 caolanm 2005-04-18 14:06:50 UTC
anything like accessibility enabled ? 
Comment 8 drichard 2005-04-18 14:19:33 UTC
oa1:~ # rpm -qa | grep vfs
gnome-vfs2-devel-2.6.1-6.19
gnome-vfs2-2.6.1-6.20

oa1:~ # rpm -qa | grep gtk | sort
gtk-1.2.10-881.2
gtk-engines-0.12-960.2
gtk-sharp-1.0.4-0.1
gtk2-2.4.9-0.4
gtk2-devel-2.4.9-0.4
gtk2-engines-2.2.0-395.3
gtk2-themes-0.1-634.2
gtkhtml2-3.2.4-0.1
gtksourceview-1.0.1-2.1
gtkspell-2.0.5-56.1
libgtkhtml-2.6.1-2.1
python-gtk-2.2.0-2.2
oa1:~ # rpm -qa | grep glibc
glibc-2.3.3-98.38
glibc-locale-2.3.3-98.41
glibc-devel-2.3.3-98.38
glibc-html-2.3.3-98.41
glibc-i18ndata-2.3.3-98.41
oa1:~ # 

This is stock NLD installed cleanly with SP1 downloaded from red-carpet.  Maybe
at some point you brought down newer packages internally that aren't available
to outside red-carpet users?  This is a very clean install, just a few days
old.   Michael I can get you logged into that server with Citrix Metaframe for
Unix if that would help.  

We are trying to keep our two copies of NLD (one for Evolution, one for
OpenOffice) 'clean' and supported and want to only download packages right from
Novell.  We agreed on NLD when working up the support contract for using
OOo/Evolution/GroupWise.

Let me know how I can help.
Comment 9 drichard 2005-04-18 14:52:30 UTC
Accessibility is not enabled.   

I tried a few things:

1) Ran it from the console instead of remote display to thin clients, it's doing
the same thing.

2) Changed theme to a few different selections to see if that mattered, it did not.
Comment 10 mmeeks 2005-04-18 17:11:35 UTC
Dave - to confirm, this is not happening with the OO.o build you get in NLD
though ? - which is also using the (same) gtk+ file selector.

So - this really does sound like some up-stream snafu. I guess when bugs:
i#46800# i#47163# get merged / into the mainline & we're building from the same
tree, it should be easier to isolate the problem.

The other thing is - does the ooo-645 package from the red-carpet machine work
for you - the fpicker code in that is extremely similar to what shipped in m79;
so ...

Of course - there is one other bug that is the broken osl_Condition nonsense
that up-stream will not fix until 2.0.1 - if you have a MT machine - it's
entirely possible that you're being bitten by i#44627#; hard to say.
Comment 11 drichard 2005-04-18 19:14:17 UTC
You might be onto something about multi-threading.  The machine is a quad-3Ghz
server.  I disabled hyper-threading (which makes the machine obviously look like
an 8way to Linux) and it still failed.  This time however, I did get it to
physically crash and report some information.  Turning off hyper-threading was
the first time I got output on the command line in this manner.  I can't say
for sure 100% this is the difference, but just reporting what I'm seeing.  I'll
attach the text that appeared on the command line when I attempted to access the
GNOME file chooser.

Yes, that older M79 build worked ok.  I could select files and saw no crashes. 
Also, older builds from Sun worked OK as I mentioned above.  They changed
something in the last 2-3 milestones that caused this failure.

If they are holding fixes for NLD until 2.0.1, we probably will just revert OOo
back to the regular file manager which seems to work much better in these later
milestones.  

Attaching output that went to the command line.
Comment 12 drichard 2005-04-18 19:15:13 UTC
Created attachment 25182
Command line output when clicking on GNOME file chooser.
Comment 13 mmeeks 2005-04-19 10:25:02 UTC
> You might be onto something about multi-threading.  The machine is a quad-3Ghz
> server.  I disabled hyper-threading (which makes the machine obviously look 
> like an 8way to Linux) and it still failed.  This time however, I did get it 
> to physically crash and report some information.

   Can you attach gdb to OO.o and get a more detailed stack trace ? that'd
perhaps be helpful.

> If they are holding fixes for NLD until 2.0.1, we probably will just revert
> OOo back to the regular file manager which seems to work much better in
> these later milestones.  

   Up-stream is just not committing a fix to osl_Condition [ which is that it
can't be waited on with a timeout reliably ;-], it shouldn't cause a crasher,
although I can well believe it causes your problem. Of course, that fix was in
our m79 builds, and will be in our 2.0.0 final builds. They perceive the fix as
risky / cf. the bug.
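
To illustrate what "can be waited on with a timeout reliably" asks of a condition, here is a minimal C++ sketch. It is not the osl_Condition code and not the i#44627# fix; the class name ManualResetEvent and its methods are hypothetical, built on the standard library's std::condition_variable. The idea shown is only that the signalled flag and the wait share one mutex, and the timed wait re-checks the flag as a predicate, so a set() cannot slip in unnoticed between the check and the sleep.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical manual-reset condition (illustrative only, not the OOo osl API).
class ManualResetEvent {
public:
    // Mark the event as signalled and wake every waiter.
    void set() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_signalled = true;
        }
        m_cv.notify_all();
    }

    // Clear the signalled state; later waits block until the next set().
    void reset() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_signalled = false;
    }

    // Returns true if the event is (or becomes) signalled before the timeout
    // expires, false on timeout. The predicate is evaluated under the mutex,
    // which also protects against spurious wake-ups.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        return m_cv.wait_for(lock, timeout, [this] { return m_signalled; });
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_signalled = false;
};
```

A caller would typically loop on wait_for() with a short timeout and do housekeeping whenever it returns false.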

   Anyhow - if you want a build that works well on NLD, it's prolly best
grabbing the latest (source) release of ooo-build, and giving it a go. Failing
that we'll have packages of an m92 release on red-carpet.go-oo.org in the next
few days DV.
Comment 14 federicomena 2005-04-20 03:24:24 UTC
Could you do this:

gconftool-2 --set /desktop/gnome/interface/file_chooser_backend --type=string gtk+

and see if the problem goes away?  If it does, then it means something is wrong
in the VFS backend or in gnome-vfs itself.

To restore things to their original state, use this:

gconftool-2 -u /desktop/gnome/interface/file_chooser_backend
Comment 15 drichard 2005-04-20 13:49:08 UTC
Did as you requested, using the gconf command line.  I issued the command, and
OOo failed to work in the same manner.  I saw no difference at all in the way
it's working.

If it's a VFS problem, it must be related to VFS on a MP server because you
aren't seeing it. [Server is an HP, Quad-3.00Ghz, 8GB physical memory, 20GB swap
space].  Maybe the NLD bigmem, MP kernel interacting with VFS which probably was
not tested heavily during the beta period.

I have been able to figure out that the command line spewage happens when you
have the native OOo file manager selected, toggle back to using the GNOME file
chooser and then click on File-->Open. It crashes the first time completely, and
then each time after that it works as it did in the shockwave shots I did above.
 No crash, just does nothing.  I'll try and get gdb on it during this type of crash.
Comment 16 federicomena 2005-04-20 17:11:38 UTC
Stupid question:  did you restart OOo after doing "gconftool-2 --set ..."?

And to double-check:  before using that command, the File/Open dialog should
display a list of available volumes in the left-hand pane (floppy, CDROM, etc.).
 After using that command, it should list no volumes other than "Filesystem".

If the second thing happens properly (showing no volumes), then I have no idea
of what could be causing the bug.
Comment 17 drichard 2005-04-21 15:00:57 UTC
The conditions of my test were that OpenOffice was completely closed.  I issued
the command and started OOo and then checked the results.

I'm not seeing any difference in the file chooser with and without the command
issued.  No "volumes" ever appear in the chooser.  That possibly could be
because I'm not logging into the console itself.  I'm telneting to the server,
setting my DISPLAY back to my thin client and then using remote display.

I can get anyone interested in seeing it first hand logged in via Citrix Metaframe.

M95 just hit the mirrors, testing that now.
Comment 18 drichard 2005-04-21 17:24:15 UTC
M95 doesn't allow use of the GNOME file chooser at all, couldn't test. 
Unchecking the option to use the Openoffice.org dialog doesn't do anything.  The
default OOo file manager comes up either way.  Possibly might have to submit
this as another issue?  gnome-integration package was installed correctly in M95.

I'm going back to M93 and see if I can get gdb to report additional information.
Comment 19 drichard 2005-05-03 15:04:22 UTC
Michael and Novell pushed their own build of M92 out, and they requested that I
bring it down and test on our server.  [[ NLD, all patches installed ]].

I clicked on the "Open" icon and it took a LONG time to open (felt like it was
crashing), when I clicked on the Cancel button it crashed and bug buddy opened
and I'm attaching the debugger information.  So it seems like whatever is
happening, is almost exactly the same in the Novell builds.  It's got to be
related to big memory or multiple CPUs because that is the only thing that is
different than the workstations where it works correctly.  Hyperthreading is
out, I've turned that off for our testing.

Attaching gdb (bug buddy) information.
Comment 20 drichard 2005-05-03 15:05:26 UTC
Created attachment 25750
Bug Buddy produced output after attempts to use the GTK file chooser.
Comment 21 federicomena 2005-05-03 16:45:58 UTC
From the stack trace it looks like you have an old libgnomeui.

These are the SRPMS for the packages that will be in NLD SP2:
http://primates.ximian.com/~federico/misc/gtk+/SRPMS/

I wonder if you could give them a try.  In particular, they fix a bunch of
threading problems.
Comment 22 drichard 2005-05-03 17:28:47 UTC
Is there a way to get these built in a manner so I can just rpm the binary into
NLD?   A possible pre-release of SP2?  I don't feel comfortable trying to build
from source and trying to integrate those packages into that server.  We are
trying to keep that server "stock" as much as possible so that we can open
support calls if required with Novell.

If not, we can wait for SP2 I guess.
Comment 23 federicomena 2005-05-04 02:29:03 UTC
OK, I've uploaded built packages here:

http://primates.ximian.com/~federico/misc/gtk+/
Comment 24 drichard 2005-05-05 19:00:46 UTC
Ok, got the rpms (thanks!) and I'm going to schedule a reboot of that server,
re-enable hyperthreads and install updated packages so it can be tested correctly.
Comment 25 federicomena 2005-05-05 19:35:37 UTC
Good.  Please make sure you use

gconftool-2 -u /desktop/gnome/interface/file_chooser_backend

as described above to restore the GConf key to its original state.  This will
let you test the VFS backend, which is what we are interested in.
Comment 26 drichard 2005-05-05 22:09:08 UTC
Ok, those rpm patches went right on top of SP1 just fine and installed:

oa1:/users/drichard/federico # rpm -qa | grep vfs
gnome-vfs2-devel-2.6.1-6.26
gnome-vfs2-2.6.1-6.26
oa1:/users/drichard/federico # rpm -qa | grep gtk2
scim-gtk2-immodule-1.0.1-1.7
gtk2-2.4.14-0.5
gtk2-engines-2.2.0-395.3
gtk2-devel-2.4.14-0.5
gtk2-themes-0.1-634.2
oa1:/users/drichard/federico # rpm -qa | grep gnomeui
libgnomeui-2.6.2-0.1
libgnomeui-devel-2.6.2-0.1
oa1:/users/drichard/federico # 

I then ran M100 and enabled the GNOME file chooser and it's working in exactly
the same manner.  

Here is what it's doing:

- It navigates correctly through the file systems, including NFS drives.  I
then select to Open and the GUI closes and the OOo panel never gets the file. 
It's empty and still "Untitled"

- If I open OOo and then type in some text and hit the Save button and type in a
name and attempt to save it, the GUI closes and nothing happens.  The title bar
doesn't change to reflect the file name.

- If I hit File->Open and then [Cancel] the OOo main gui no longer accepts
File->close to exit the program.

Sorry :\
Comment 27 mmeeks 2005-05-06 15:19:44 UTC
So - I'm guessing this is a combination of threading woes & a (now fixed)
libgnomeui threading problem - that Federico has fixed.

I got Dave to upgrade to our m100 builds; and built a debug enabled library for
him. Unfortunately the non-debugging m100 of ours works - where (apparently) the
m92 one didn't. [ no changes in ooo-build in this area between them, and nothing
relevant up-stream ]. However the up-stream / Sun m100 doesn't work.

So - this has to be some deep race foo - perhaps helped in our build by our
fix for i#44627#; without a working 'condition', life is rather difficult.

Looking for such a thing reveals that the asynceventnotifier.cxx is full of
threading / condition foo and that 'm_bRun' is not correctly guarded by a mutex.
 
The comment in the startup method there scares me too:

	// m_bRun may already be false because of a
	// call to stop but the thread did not yet
	// terminate so m_hEventNotifierThread is
	// yet a valid thread handle that should 
	// not be overwritten
	if (!m_bRun)

Are we really expecting to re-use this thread multiple times & start/shutdown it
in a potentially interleaved fashion: I hope not, the code doesn't look like it
can cope with that.
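
For illustration, a run flag that tolerates the "stopped but not yet terminated" state described in that comment could be modelled roughly as below. This is a sketch under assumptions, not the fpicker code: RunState and its method names are hypothetical, and the real class also has to manage the thread handle. The point is that every read and write of the flags goes through the same mutex, and a restart is refused until the previous worker has reported that it really exited.

```cpp
#include <mutex>

// Hypothetical state holder for a restartable worker (illustrative names).
class RunState {
public:
    // Try to begin a new run. Returns false if a run is active, or if a
    // previous run was asked to stop but its thread has not finished yet.
    bool try_start() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_running || !m_terminated)
            return false;
        m_running = true;
        m_terminated = false;
        return true;
    }

    // Ask the worker loop to leave; the thread may still be executing.
    void request_stop() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_running = false;
    }

    // Called by the worker as its very last step before returning.
    void mark_terminated() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_terminated = true;
    }

    // Loop condition for the worker, read under the same mutex as the writes.
    bool is_running() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_running;
    }

private:
    mutable std::mutex m_mutex;
    bool m_running = false;    // plays the role of m_bRun
    bool m_terminated = true;  // true while no worker thread is alive
};
```

With this shape, a start that arrives between request_stop() and mark_terminated() fails cleanly instead of overwriting a handle that is still in use.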

Anyhow - it seems there is an obvious race between the m_aNotifyCondition.wait()
and the m_aNotifyCondition.reset() that is unnecessary - this patch addresses
that and fiddles with the m_aExitCondition.reset() in a rather futile way to
remove the sense of another race - it should really be guarded by a rather
tedious lock ;-)

Finally - I forget quite why we needed the async thread to do this; I thought we
wanted to post the events to be emitted from the Application:: main loop, but
perhaps I'm mistaken.

Caolan ?

--- fpicker/source/unx/gnome/asynceventnotifier.cxx	18 Jan 2005 13:25:45 -0000	1.3
+++ fpicker/source/unx/gnome/asynceventnotifier.cxx	6 May 2005 14:15:27 -0000
@@ -141,9 +141,9 @@ void SAL_CALL SalGtkAsyncEventNotifier::
 	osl::ResettableMutexGuard aGuard(m_aMutex);
 	
 	OSL_PRECOND(m_bRun,"Event notifier does not run!");
 
-	m_bRun = false;
 	m_aExitCondition.reset();
+	m_bRun = false;
 
 	m_aNotifyCondition.set();
 	
@@ -210,6 +211,7 @@ void SAL_CALL SalGtkAsyncEventNotifier::
 	while (m_bRun)
 	{       
 		m_aNotifyCondition.wait();
+		m_aNotifyCondition.reset();
 
 		while (getEventListSize() > 0)
 		{
@@ -238,7 +240,6 @@ void SAL_CALL SalGtkAsyncEventNotifier::
 				}
 			}
 		} // while(getEventListSize() > 0)
-		m_aNotifyCondition.reset();
 	} // while(m_bRun)
 	m_aExitCondition.set();
 }
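
The wait()/reset() ordering that the patch changes can be shown with a small single-threaded model that replays one unlucky interleaving step by step. This is only a sketch: Flag, Notifier and stuck_after_iteration are hypothetical names, the flag never blocks, and the mutex the real code holds around the event list is left out. It replays "producer posts an event just after the consumer's last queue check" under both orderings.

```cpp
#include <deque>

// Hypothetical non-blocking model of a manual-reset condition.
struct Flag {
    bool signalled = false;
    void set() { signalled = true; }
    void reset() { signalled = false; }
};

// Hypothetical model of the notifier: an event queue plus its wake-up flag.
struct Notifier {
    Flag notify;
    std::deque<int> events;

    // Producer side: queue an event, then signal the consumer.
    void post(int ev) {
        events.push_back(ev);
        notify.set();
    }

    // Consumer side: handle everything currently queued.
    void drain() {
        while (!events.empty())
            events.pop_front();
    }
};

// Replays one consumer iteration with a producer post() landing right after
// the drain loop has seen an empty queue. Returns true if the consumer would
// now block in wait() although an event is still queued (a lost wake-up).
bool stuck_after_iteration(bool reset_before_drain) {
    Notifier n;
    n.post(1);                   // wakes the consumer; wait() returns here
    if (reset_before_drain)
        n.notify.reset();        // patched ordering: reset right after wait()
    n.drain();
    n.post(2);                   // producer interleaves after the last check
    if (!reset_before_drain)
        n.notify.reset();        // old ordering: wipes out the new signal
    return !n.events.empty() && !n.notify.signalled;
}
```

In this model stuck_after_iteration(false) reports the lost wake-up, while stuck_after_iteration(true) leaves the flag set so the consumer would loop again and pick up the second event.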
Comment 28 drichard 2005-05-18 15:04:41 UTC
FYI- Tested Sun M104 and still fails, cannot use GTK File Chooser on multi-CPU
NLD.  Will have to wait for those patches to make their way into 2.0.1
Comment 29 caolanm 2005-06-01 20:24:19 UTC
drichard: Can you try the m106 openoffice.org style rpms at 
http://ooomisc.services.openoffice.org/pub/OpenOffice.org/cws/upload/fpicker4/
and
http://ooomisc.services.openoffice.org/pub/OpenOffice.org/cws/upload/fpicker4.alternative/
and let me know how you get on with them.

mmeeks: I don't think we need any of the complex threading foo, and I've ditched
it in the above two installsets and rejigged things a little. 

fpicker4 is with #i44627# patch, fpicker4.alternative is without it. What I'd
like to see as the result is for fpicker4 to work, and fpicker4.alternative to
show the problematic behaviour.
Comment 30 mmeeks 2005-06-03 17:39:12 UTC
fpicker4 looks great in this regard - holding the solar mutex seems far more
sensible :-)
Comment 31 drichard 2005-06-03 19:30:50 UTC
Michael pointed me to this updated issue.  I didn't get the email or I would
have done it immediately for you, sorry.

I brought down both builds, and installed on that multi-processor system with
bigmem kernel.  fpicker4 works perfectly.  It 'feels' like the GNOME file picker
is part of the gui now.  In previous versions it would always kind of be
sluggish in interaction with OOo and felt like it was launching an external
application.  And as was reported, files wouldn't always work, and menu options
would disable and stop working after using the picker.  Dialogs snap open, and
work much faster with fpicker4.

I installed fpicker4.alternative and tried it as well.  It worked like the older
milestones:  Sluggish, felt like it was going to crash, disabled pulldowns,
sometimes would load files, sometimes would not load them.

From our perspective this is the best GTK picker milestone yet, and if the code
doesn't mess up others, would love for it to be merged into the build.

I'm leaving the bug open per your discretion.  fpicker4 to me would close this
issue.
Comment 32 drichard 2005-06-03 20:56:23 UTC
fpicker4 has been moved to the test server and about 20 people will be using it
full time next week. I'll report any problems.
Comment 33 caolanm 2005-06-08 14:32:25 UTC
Grand. The only difference between the two installsets is #i44627#. So closing
as a duplicate of that. The broken "alternative" fpicker4 is without the 44627
fix. The separate code re-works and other cleanups are done in fpicker4 under
the issues registered for that workspace.

*** This issue has been marked as a duplicate of 44627 ***
Comment 34 caolanm 2005-06-08 14:33:28 UTC
.