Apache OpenOffice (AOO) Bugzilla – Issue 96152
Java Uno wrapper locks up when connection disappears
Last modified: 2014-02-02 13:14:38 UTC
When a client is using an OOo server process, for example to run a conversion, it deadlocks if the connection with the server process disappears, for example because the server is killed. The following is a stack trace of a client that is stuck in this way:

"main" prio=1 tid=0x08058a08 nid=0x1e69 in Object.wait() [0xbf901000..0xbf901588]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x44cf0358> (a java.lang.Object)
    at java.lang.Object.wait(Object.java:429)
    at com.sun.star.lib.uno.protocols.urp.urp.writeRequest(urp.java:130)
    - locked <0x44cf0358> (a java.lang.Object)
    at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendRequest(java_remote_bridge.java:648)
    at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendInternalRequest(java_remote_bridge.java:687)
    at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.getInstance(java_remote_bridge.java:587)
    at com.sun.star.comp.urlresolver.UrlResolver$_UrlResolver.resolve(UrlResolver.java:140)
    at be.re.ooo.Uno.getComponentFactory(Uno.java:61)
    at be.re.ooo.Uno.getComponentLoader(Uno.java:93)
    at be.re.ooo.UnoConverter.convert(UnoConverter.java:72)
    at be.re.ooo.Reporter$ConvertOutputStream.close(Reporter.java:504)
    at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:149)
    at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:311)
    at be.re.odf.Reporting.merge(Reporting.java:1056)
    at be.re.ooo.Reporter.merge(Reporter.java:374)
    at be.re.ooo.Reporter.main(Reporter.java:196)
@ sg: Please have a look.
I'm now using version 3.1.0 and the problem also exists there. The behavior seems to depend on whether the server process runs on Windows or on Solaris. When the server disappears on Solaris, the client is not locked; when the same happens with a server process on Windows, the client remains locked.
I am not sure what the fix should be, since it is the server that goes wrong here. I assume that this can be fixed easily by restarting the client?
It is difficult to reproduce. I don't know the code, but there may be a synchronized section too many in the client stub. Restarting the client is indeed a workaround.
I've run into this as well, and our load tester seems to be able to reproduce it fairly easily. I will add more details on that once I have them.

In my code we're using the 'client' inside of a server process, which makes restarting the client a bit less desirable. The lockup seems to happen when attempting to open a new connection to an instance of OpenOffice that has locked up or is otherwise in a bad state. The client just sits in a wait, hoping to be notified by another thread, and that notification never happens.

I've updated a copy of writeRequest to look like this:

    // @see IProtocol#writeRequest
    public boolean writeRequest(
        String oid, TypeDescription type, String function, ThreadId tid,
        Object[] arguments)
        throws IOException
    {
        if (oid.equals(PROPERTIES_OID)) {
            throw new IllegalArgumentException("illegal OID " + oid);
        }
        synchronized (monitor) {
            while (!initialized && state != STATE_TERMINATED) {
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(e.toString());
                }
            }
            if (state == STATE_TERMINATED) {
                throw new DisposedException();
            }
            return writeRequest(false, oid, type, function, tid, arguments);
        }
    }

The change is the added state != STATE_TERMINATED check in the loop condition, plus the DisposedException thrown after the loop. This appears to have fixed the problem so far, but I think it still needs a timeout on the monitor.wait() call so it doesn't sit there waiting indefinitely.

When I detect a stuck instance, my code kills the OpenOffice instance and also calls close on the bridge object associated with it. That code flips the state to STATE_TERMINATED and notifies on monitor. I'm not sure if that will work for all users, so a timeout would probably be needed here as well.
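For reference, the timeout suggested above could look roughly like the following. This is only a minimal sketch and not the actual urp.java code: the class HandshakeGate, its methods, and the choice of IOException for the timeout and termination cases are all assumptions made for illustration. It waits on a monitor with a deadline, rechecks the condition after every wakeup, and gives up once the deadline has passed or the state has been flipped to terminated.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical stand-in for the protocol's handshake state.
 * None of these names come from the real urp class.
 */
final class HandshakeGate {
    private final Object monitor = new Object();
    private boolean initialized = false;
    private boolean terminated = false;

    /** Called once the protocol handshake has completed. */
    void markInitialized() {
        synchronized (monitor) {
            initialized = true;
            monitor.notifyAll();
        }
    }

    /** Called when the bridge is closed or the connection is lost. */
    void terminate() {
        synchronized (monitor) {
            terminated = true;
            monitor.notifyAll();
        }
    }

    /**
     * Blocks until initialized, but never longer than timeoutMillis.
     * Throws IOException on timeout, termination, or interruption.
     */
    void awaitInitialized(long timeoutMillis) throws IOException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (monitor) {
            while (!initialized && !terminated) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new IOException("timed out waiting for initialization");
                }
                try {
                    // wait(0) would block forever, so wait at least 1 ms.
                    monitor.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting for initialization");
                }
            }
            if (terminated) {
                throw new IOException("bridge terminated");
            }
        }
    }
}
```

With something like this, a caller blocked in awaitInitialized(30000) would get an exception after at most about 30 seconds even if nobody ever calls terminate(), which covers users who do not close the bridge themselves. The deadline is computed once up front so that spurious wakeups do not extend the total wait.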