25506 – XCloseable::close() (Java) hangs the Java application in certain situations.

Status:	CLOSED NOT_AN_OOO_ISSUE

Alias:	None

Product:	App Dev
Classification:	Unclassified
Component:	api
Version:	3.3.0 or older (OOo)
Hardware:	All Windows XP

Importance:	P3 Trivial
Target Milestone:	---
Assignee:	Giuseppe Castagno (aka beppec56)
QA Contact:	issues@api

URL:
Keywords:	oooqa

Depends on:
Blocks:

Reported:	2004-02-15 15:37 UTC by Giuseppe Castagno (aka beppec56)
Modified:	2017-05-20 09:32 UTC
CC List:	1 user

See Also:
Issue Type:	DEFECT
Latest Confirmation in:	---
Developer Difficulty:	---

Attachments
Java app code document and description to reproduce (689.37 KB, application/octet-stream) 2004-02-15 15:40 UTC, Giuseppe Castagno (aka beppec56)

Description Giuseppe Castagno (aka beppec56) 2004-02-15 15:37:24 UTC

This happens after heavy automated work that jumps back and forth between hyperlinks.

OO build: 645m27s2(build:8739)

It seems to be a duplicate of Issue 23706, but I use close() in what I think is
the right way (see my Java code for details).

OS: Windows XP Professional SP1 English, PC with 512 MB RAM.

To reproduce the behaviour you need the Java application I used; it generates
the information Ghostscript needs to build a PDF file with hyperlinks intact.

Other stuff needed: Java SDK 1.4.1_07, Java RE 1.4.1_07, NetBeans 3.5.1, the
provided zip file containing the Java application source (under GPL) and an sxw
file used to test.
Ghostscript is not needed to carry out the test.

Instructions on how to reproduce it and the Java code are in the file
reproduce.txt in the provided OOPdfBuilder.bug.zip.

This behavior doesn't show up with the same intensity in build 645m19(Build:8693).

Comment 1 Giuseppe Castagno (aka beppec56) 2004-02-15 15:40:50 UTC

Created attachment 13143
Java app code document and description to reproduce

Comment 2 ooo 2004-02-16 09:49:00 UTC

forwarding to responsible developer

Comment 3 andreas.schluens 2004-02-16 10:43:04 UTC

This task is a duplicate of #i14397#. Such deadlock bugs will be fixed in
general for the OOo 2.0 target.

The problem in your code: you don't use the "wait" parameter for your print()
requests, and you close the document afterwards but ignore the exception.

You should use the "wait" parameter on your print requests and experiment with
close(true). Maybe it will work a little better then.
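
The advice above (print synchronously, then close and react to a veto instead of
ignoring it) can be sketched in plain Java. This is only an illustration: the
types ClosableDoc, CloseVetoException and FakeDocument below are hypothetical
stand-ins that loosely mirror com.sun.star.util.XCloseable and its
CloseVetoException, so the sketch compiles without the UNO jars. It does not
talk to a real office instance.

```java
public class CloseWithVetoSketch {

    /** Stand-in for com.sun.star.util.CloseVetoException (not the real UNO class). */
    static class CloseVetoException extends Exception {
        CloseVetoException(String message) { super(message); }
    }

    /** Stand-in for com.sun.star.util.XCloseable (assumption, not the real interface). */
    interface ClosableDoc {
        void close(boolean deliverOwnership) throws CloseVetoException;
    }

    /** Hypothetical document that vetoes close() while a print job is still running. */
    static class FakeDocument implements ClosableDoc {
        private boolean printing;
        private boolean closed;

        /** wait == true models a print call that returns only when the job is done. */
        void print(boolean wait) { printing = !wait; }

        boolean isClosed() { return closed; }

        @Override
        public void close(boolean deliverOwnership) throws CloseVetoException {
            if (printing) {
                throw new CloseVetoException("print job still running");
            }
            closed = true;
        }
    }

    /** Returns true if the document closed, false if the close was vetoed. */
    static boolean tryClose(ClosableDoc doc) {
        try {
            doc.close(true);
            return true;
        } catch (CloseVetoException e) {
            // Report the veto instead of swallowing it silently.
            System.err.println("close vetoed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        FakeDocument asyncDoc = new FakeDocument();
        asyncDoc.print(false);               // job still running when close() is called
        System.out.println("closed after async print: " + tryClose(asyncDoc));

        FakeDocument syncDoc = new FakeDocument();
        syncDoc.print(true);                 // job finished before close() is called
        System.out.println("closed after sync print: " + tryClose(syncDoc));
    }
}
```

The point of the sketch is the caller's side: the result of close() is checked,
so the client knows whether the document is really gone before it continues.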

Comment 4 Giuseppe Castagno (aka beppec56) 2004-02-16 15:15:37 UTC

I corrected the code as suggested: I added the wait property to the parameters
of print() and changed close(false) to close(true). Thank you for suggesting it.
After this, 1.1.0 was better (more stability and no crash except with long
documents, in which case my problem is somewhere else), but with 1.1.1b it still
locked up in the close(true) function.
One last thing: where is "wait" documented? It isn't here:
http://api.openoffice.org/docs/common/ref/com/sun/star/view/PrintOptions.html
I found examples in the api ML, though.

Comment 5 andreas.schluens 2004-02-17 08:10:06 UTC

You are right. This parameter does not seem to be documented in any version of
this document. Sorry, our fault. I will fix that ASAP.

It's good that your test works a little better now. But as I already mentioned,
our threading problems are not easy to fix. Please stay tuned for OOo 2.0. We
have scheduled such problems to be fixed for that version ... hopefully :-)

Comment 6 Martin Hollmichel 2004-06-04 14:41:08 UTC

reassigned back to as.

Comment 7 Giuseppe Castagno (aka beppec56) 2004-06-04 17:10:38 UTC

I think that this is the same behavior I described in
 
http://qa.openoffice.org/issues/show_bug.cgi?id=28355

so I am adding some comments on what I've done so far. They apply to i28355 as
well, even though that issue is now closed.

Since I didn't like this OOo behavior and 2.0 is still not stable, I debugged
OOo directly, continuing as I stated in i28355. I found that part of the problem
was that some array wasn't locked for exclusive access by the running OOo
threads.

I then added a new mutex (in the sal project) that is now used to interlock the
data access (so it's used in some of the code of the sw project, in more than a
single file).

Of course this runs only on Windows, mainly because I have no way to check other
OSes, and it solved only the problem at hand. One of the problems on other OSes
may be the way the mutex is implemented there (is it possible to nest mutex
calls on Linux, for example? I honestly don't know).
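
Whether nested lock calls are safe depends on whether the mutex is recursive
(reentrant). With POSIX threads, for example, a mutex must be created with the
PTHREAD_MUTEX_RECURSIVE type for nesting by the same thread to be allowed. The
idea can be illustrated with a minimal Java sketch that uses
java.util.concurrent.locks.ReentrantLock; the class and method names are made up
for the illustration, and it says nothing about how the mutex in the sal project
is implemented on any platform.

```java
import java.util.concurrent.locks.ReentrantLock;

public class NestedLockSketch {
    // A reentrant lock: the owning thread may acquire it again without blocking.
    static final ReentrantLock LOCK = new ReentrantLock();

    /** Takes the lock, then calls inner(), which takes the same lock again. */
    static int outer() {
        LOCK.lock();
        try {
            return inner();
        } finally {
            LOCK.unlock();
        }
    }

    /** Returns how many times the current thread holds the lock at this point. */
    static int inner() {
        LOCK.lock();
        try {
            return LOCK.getHoldCount();
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("hold count inside nested call: " + outer());
        System.out.println("lock still held afterwards: " + LOCK.isLocked());
    }
}
```

With a non-recursive mutex the nested acquisition in inner() would block the
thread on itself, which is one way a single-threaded code path can hang.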

In any case, on a dual-processor PC (Win NT 4 Server + SP6a), where it didn't
run at all before adding the data mutex, it still does not run well: at a
certain point OOo stops running, and I suspect that there is some deadlock
situation to be checked out. Unfortunately I can't debug on the dual-processor
PC.
After adding the mutex, the issue has not shown up on single-processor PCs, so
for our 'in house' use this problem is patched, at least for the time being and
in version 1.1.2.

From this you can get an idea of whether this issue can be closed or not, and
how to proceed.

Comment 8 andreas.schluens 2004-06-07 08:32:06 UTC

This bug here does not describe the same behavior as described in
"http://qa.openoffice.org/issues/show_bug.cgi?id=28355"!

Here you described a deadlock, which could be worked around by using the API in
the right way. There you had a crash (not a deadlock), which could be fixed by
using a mutex.

Further, the Issue "http://qa.openoffice.org/issues/show_bug.cgi?id=23706"
describes a misuse of the API, so it can't be related to this bug either.

I have set your bug as a duplicate of
"http://qa.openoffice.org/issues/show_bug.cgi?id=14397".

The question is: how did you reach a situation where the same writer document
was scripted from different threads, changing the same enumeration? I would say
that these threads must be established by you and not by the office itself. How
do you synchronize these outside threads?

(BTW: Why did you add a new mutex to the sal project? We already have several
mutex implementations!)

Further: as I already mentioned, we are trying to fix these multithreading
problems for our OOo 2.0 version. Please stay tuned for our solution.

*** This issue has been marked as a duplicate of 14397 ***

Comment 9 andreas.schluens 2004-06-07 08:33:32 UTC

reopened to send it back ...

Comment 10 andreas.schluens 2004-06-07 08:34:18 UTC

Comment 11 Giuseppe Castagno (aka beppec56) 2004-06-07 23:04:44 UTC

Oops, I see that I didn't explain myself correctly; I screwed up the meaning.

A brief history.
First of all, when I say run OOo I mean run OOo as a server through a Java
client application. In any case, I have no problem at all with OOo as a
stand-alone application.

I don't use OOo with a multithreaded Java application: there is only one Java
thread running, and at the same time OOo is not used for other work. The only
(possible) multithreading, by chance, is the use of framework commands mixed
with direct UNO functions (e.g. requesting a xViewCursor.getPosition() while a
".uno:OpenHyperlinkOnCursor" was being performed).

The problem I described here in i25506 was the first problem I hit, and indeed
in this case the fault was mine, in the Java application. After some corrections
on my part, it almost disappeared in version 1.1.2.

Almost, because the problem was still there but its behavior changed: OOo
crashed very often on large files, and this brought me to i28355, where I was
(finally!) able to debug OOo directly. In any case, if OOo was run on a
dual-processor PC it always crashed.

After what I was told in i28355, I tried the 680 build current at that time and
it worked. Unfortunately the 680 was very unsuitable for the rest of my job
(i.e. still unstable in everyday use), so I resolved to dig deeply into
debugging 1.1.2.

By debugging OOo (from now on walking into i28355 scope) I always came across an
array whose elements were changed by some thread (I don't know which or when;
let's call it thread B) while the crashing thread (let's call it thread A) was
iterating through it (for example through the SwBookmarks array obtained with
rDoc.GetBookmarks()); that is, some elements were deleted by an unknown thread B
while thread A was iterating through it. I thought it was some sort of data not
locked for exclusive access by the global mutex that should be active at that
moment.
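
The situation described above (thread B deleting elements while thread A
iterates) and the kind of fix applied (every access goes through one shared
lock) can be sketched in Java. This is only an illustration under assumptions:
GuardedIterationSketch and its list of strings are hypothetical and merely stand
in for the SwBookmarks array; the actual OOo code is C++ and is not reproduced
here.

```java
import java.util.ArrayList;
import java.util.List;

public class GuardedIterationSketch {
    private final List<String> bookmarks = new ArrayList<>();
    private final Object lock = new Object(); // one lock guards every access

    GuardedIterationSketch(int count) {
        for (int i = 0; i < count; i++) {
            bookmarks.add("bookmark-" + i);
        }
    }

    /** "Thread A": walks the whole list while holding the lock. */
    int countNonEmpty() {
        synchronized (lock) {
            int n = 0;
            for (String name : bookmarks) {
                if (!name.isEmpty()) n++;
            }
            return n;
        }
    }

    /** "Thread B": deletes an element while holding the same lock. */
    void removeLast() {
        synchronized (lock) {
            if (!bookmarks.isEmpty()) bookmarks.remove(bookmarks.size() - 1);
        }
    }

    int size() {
        synchronized (lock) {
            return bookmarks.size();
        }
    }

    /** Runs A and B concurrently; returns {failed iterations, elements left}. */
    static int[] run(int count, int removals) {
        GuardedIterationSketch doc = new GuardedIterationSketch(count);
        Thread b = new Thread(() -> {
            for (int i = 0; i < removals; i++) doc.removeLast();
        });
        b.start();
        int failures = 0;
        for (int i = 0; i < 1000; i++) {
            try {
                doc.countNonEmpty();
            } catch (RuntimeException e) {
                failures++; // e.g. ConcurrentModificationException if unguarded
            }
        }
        try {
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { failures, doc.size() };
    }

    public static void main(String[] args) {
        int[] result = run(100, 40);
        System.out.println("failures=" + result[0] + " remaining=" + result[1]);
    }
}
```

The lock only helps if both the iterating code and the deleting code take it;
guarding just one side leaves the same race in place.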

For me it was too difficult to see whether the global mutex was active or not,
or misused somehow, so I decided to add another one: not a brand-new
implementation, but another mutex object using the same implementation as the
global mutex (e.g. another variable identical to g_Mutex as declared in
sal/osl/w32/mutex.c), and I used it to lock access to the SwBookmarks only in
the functions where OOo crashed.

It seems I've partly resolved the issue for my use. In any case, on the dual
processor PC it behaves slightly better than before: it now does not crash but
stops (only on this dual-processor PC) and OOo eats up all the CPU time (on both
CPUs!). I think what I have there could be a deadlock, but I'm not sure, because
for now I'm not able to debug it on the dual-CPU PC and, besides, a deadlock
shouldn't eat up the CPU time, should it?
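
On the CPU question: a thread that is blocked waiting for an ordinary blocking
lock is normally descheduled and uses no CPU, so full CPU load points more
towards busy-waiting (a spin loop or livelock) than towards a classic blocking
deadlock. The following Java sketch only illustrates the first half of that
statement by observing the state of a thread that waits for a monitor; the class
name is invented for the example and nothing here is taken from the OOo code.

```java
public class BlockedThreadSketch {

    /** Returns the state of a thread observed while it waits for a held monitor. */
    static Thread.State stateWhileWaitingForLock() {
        final Object monitor = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // Reached only after the main thread has released the monitor.
            }
        });
        Thread.State observed;
        try {
            synchronized (monitor) {
                waiter.start();
                // Poll until the waiter is parked on the monitor we are holding.
                while (waiter.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(1);
                }
                observed = waiter.getState();
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("waiter state: " + stateWhileWaitingForLock());
    }
}
```

A thread in the BLOCKED state is not runnable, so it does not compete for CPU
time until the monitor is released.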

Hope this wasn't too tedious...
Thanks for your patience.
Giuseppe.

Comment 12 christianjunker 2005-07-20 16:38:45 UTC

As the threading system has been changing heavily for 680, I resolve this as
LATER.