Issue 96152 - Java Uno wrapper locks up when connection disappears
Status: UNCONFIRMED
Alias: None
Product: General
Classification: Code
Component: code
Version: OOo 2.4.1
Hardware: All Linux, all
Importance: P3 Trivial with 2 votes
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: needhelp
Depends on:
Blocks:
 
Reported: 2008-11-12 14:08 UTC by wdonne
Modified: 2014-02-02 13:14 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description wdonne 2008-11-12 14:08:46 UTC
When a client is using an OOo server process, for example to run a conversion,
it deadlocks if the connection to the server process disappears, for example
when the server process is killed.

The following is a stack trace of a client that is stuck in this way:

"main" prio=1 tid=0x08058a08 nid=0x1e69 in Object.wait() [0xbf901000..0xbf901588]
        at java.lang.Object.wait(Native Method)
        - waiting on <0x44cf0358> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:429)
        at com.sun.star.lib.uno.protocols.urp.urp.writeRequest(urp.java:130)
        - locked <0x44cf0358> (a java.lang.Object)
        at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendRequest(java_remote_bridge.java:648)
        at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendInternalRequest(java_remote_bridge.java:687)
        at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.getInstance(java_remote_bridge.java:587)
        at com.sun.star.comp.urlresolver.UrlResolver$_UrlResolver.resolve(UrlResolver.java:140)
        at be.re.ooo.Uno.getComponentFactory(Uno.java:61)
        at be.re.ooo.Uno.getComponentLoader(Uno.java:93)
        at be.re.ooo.UnoConverter.convert(UnoConverter.java:72)
        at be.re.ooo.Reporter$ConvertOutputStream.close(Reporter.java:504)
        at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:149)
        at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:311)
        at be.re.odf.Reporting.merge(Reporting.java:1056)
        at be.re.ooo.Reporter.merge(Reporter.java:374)
        at be.re.ooo.Reporter.main(Reporter.java:196)
Comment 1 Olaf Felka 2009-04-28 12:32:37 UTC
@ sg: Please have a look.
Comment 2 wdonne 2009-05-18 18:21:39 UTC
I'm now using version 3.1.0 and the problem also exists there.

The behaviour seems to differ depending on whether the server process runs on Windows or Solaris. When the
server disappears on Solaris the client does not lock up, whereas when the same happens with a server
process on Windows the client remains locked.
Comment 3 steffen.grund 2009-06-22 12:48:16 UTC
I am not sure what the fix should be here, since something goes wrong here with
the server.
I assume that this can be fixed easily by restarting the client?
Comment 4 wdonne 2009-06-22 13:33:43 UTC
It is difficult to reproduce. I don't know the code, but there may be one synchronized section too many in
the client stub. Restarting the client is indeed a workaround.
Comment 5 Jason Powers 2012-02-29 18:57:14 UTC
I've run into this as well, and our load tester seems to be able to reproduce it fairly easily. I will post more details once I have them.

In our case the 'client' runs inside a server process, which makes restarting the client less desirable.

It seems to happen when attempting to open a new connection to an instance of OpenOffice that has locked up or is otherwise in a bad state. The client then sits in wait(), expecting to be notified by another thread, and that notification never arrives.

I've updated a copy of writeRequest to look like this:
    // @see IProtocol#writeRequest
    public boolean writeRequest(
        String oid, TypeDescription type, String function, ThreadId tid,
        Object[] arguments)
        throws IOException
    {
        if (oid.equals(PROPERTIES_OID)) {
            throw new IllegalArgumentException("illegal OID " + oid);
        }
        synchronized (monitor) {
            // Wait for initialization, but stop waiting once the
            // bridge has been terminated (the added condition).
            while (!initialized && state != STATE_TERMINATED) {
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    // Restore the interrupt flag before bailing out.
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(e.toString());
                }
            }
            // Report a terminated bridge instead of sending the request.
            if (state == STATE_TERMINATED) {
                throw new DisposedException();
            }
            return writeRequest(false, oid, type, function, tid, arguments);
        }
    }

The change is the added state != STATE_TERMINATED check in the while condition. This appears to have fixed the problem so far, but I think the monitor.wait() call still needs a timeout so that it cannot wait indefinitely.
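The timeout suggested here could look roughly like the sketch below. It is not the urp.java code: HandshakeGate, awaitInitialized, markInitialized and terminate are hypothetical names standing in for the monitor, initialized and state fields, and a plain IOException replaces DisposedException so the example compiles without the UNO jars. The loop recomputes the remaining time on every pass so that spurious wakeups cannot extend the deadline, and it never calls wait(0), which would wait forever.

```java
import java.io.IOException;

/** Hypothetical stand-in for the monitor/initialized/state fields of the protocol class. */
class HandshakeGate {
    private final Object monitor = new Object();
    private boolean initialized = false;
    private boolean terminated = false;

    /** Blocks until initialized; throws if terminated, interrupted, or timed out. */
    void awaitInitialized(long timeoutMillis) throws IOException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (monitor) {
            while (!initialized && !terminated) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    // Checked before wait(): wait(0) would block without a limit.
                    throw new IOException("timed out waiting for initialization");
                }
                try {
                    monitor.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting for initialization");
                }
            }
            if (terminated) {
                throw new IOException("bridge terminated");
            }
        }
    }

    /** Called once the handshake has completed. */
    void markInitialized() {
        synchronized (monitor) {
            initialized = true;
            monitor.notifyAll();
        }
    }

    /** Called when the connection is closed; wakes up all waiters. */
    void terminate() {
        synchronized (monitor) {
            terminated = true;
            monitor.notifyAll();
        }
    }

    public static void main(String[] args) {
        HandshakeGate gate = new HandshakeGate();
        try {
            gate.awaitInitialized(200); // nobody initializes the gate, so this gives up
        } catch (IOException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

What timeout value would suit a real bridge is an open question; the sketch only shows that a bounded wait and the termination check can share one loop.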

When I detect a stuck instance, my code kills the OpenOffice instance and also calls close on the bridge object associated with it. That code flips the state to STATE_TERMINATED and notifies on monitor. I'm not sure whether that will work for all users, so a timeout would probably be needed here as well.
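How a stuck instance gets detected is not described in this comment. One possible approach, sketched here under assumptions, is to run the blocking UNO call on a worker thread and bound it with Future.get. StuckCallDetector and callWithTimeout are hypothetical names, and killing the office process or closing the bridge after a timeout is left to the caller. Cancelling only helps if the blocked code reacts to interruption; the writeRequest quoted above does, since it catches InterruptedException and rethrows.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper that bounds a blocking call with a timeout. */
class StuckCallDetector {

    /** Runs call on a daemon worker thread; throws TimeoutException if it takes too long. */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "bounded-call");
            worker.setDaemon(true); // a worker that never returns must not keep the JVM alive
            return worker;
        });
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Object lock = new Object();
        try {
            // Simulates a call that waits for a notification that never comes.
            callWithTimeout(() -> {
                synchronized (lock) {
                    while (true) {
                        lock.wait();
                    }
                }
            }, 200);
        } catch (TimeoutException e) {
            System.out.println("call is stuck; clean up the office process and bridge here");
        }
    }
}
```

A call that fails inside the worker surfaces as an ExecutionException wrapping the original cause, so the caller can tell a stuck connection apart from an ordinary error.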