Apache OpenOffice (AOO) Bugzilla – Issue 13377
SfxAsyncExec_Impl does not catch UNO excpetions
Last modified: 2005-10-10 13:38:01 UTC
Hi, in sfx2/source/control/bindings.cxx, the SfxAsyncExec_Impl class does not catch UNO exceptions, which might be thrown by the dispatch call. As this the last one on the stack, that could catch such an exception, it should do so. There is a suggested patch attached, one might think about a message box to present the error, but I don't know, what you do in these cases. Thx, Joerg
Created attachment 5623 [details] Sample patch to prevent the crash
TM->MH: Please have a look. Don´t exactly know where this one belongs too, but I think "framework" isn´t the right place.
reassigned. mh->mba: I consider crashes always for the next possible target: 1.1 RC ?
I don't think that this fix is correct. Dispatch calls don't throw exceptions (except runtime exceptions), just because we don't want to put the burden of catching them into generic UI code (that usually uses "dispatch"). IMHO we have two options: (1) If the thrown exception is a runtime exception you want to be caught without any further notice: why throw it at all? (2) If it's not a runtime exception, find the code that really should catch it, because these kind of exceptions are not allowed to pass at least the dispatch call.
Hi, > IMHO we have two options: > (1) If the thrown exception is a runtime exception you want to be > caught without any further notice: why throw it at all? The problem here is, that RuntimeExceptions may be thrown by the underlying frameworks in case of errornous conditions. Examples are: - xDisp might point to a remote UNO object. Interprocess connections may fail for different reasons like a crash in the remote process or a cut down tcp/ip connection. Both failures will result in a RuntimeException. - xDisp might point to a disposed UNO object, which then throws a DisposedException which is derived from RuntimeException. - Or as it is in my case, xDisp points to an instance of the service-DispatchProvider (implemented by Andreas as I remember). The implementation of the concrete service, it delegates to, is implemented in python. When there is a syntax error in the python script, the pyuno-framework throws a RuntimeException, which currently flies through the service-DispatchProvider to bindings.cxx. So if I understand you correctly, you would suggest to fix this in the service-DispatchProvider. This would help in my case, but this won't fix the two first problems. IMHO, at all places, where UNO calls are invoked and exceptions cannot be thrown further on (e.g. because the code isn't compiled with exception handling support), the UNO exceptions (and so at least the RuntimeException) must be caught. That's actually one of the largest code quality problems, we have... So just to show you the crashes, I have attached two uno packages. Install a SRX645m6 or later, but switch on custom setup/optional components/PyUNO-Bridge. After installation, place both uno-packages in office-install/user/uno_packages and start pkgchk from the office-install/program directory. It should take a second or so and you shouldn't get any error messages. After starting the office, you have two menu items in the tools/addons menu, choose the first one to see, that the python bindings works correctly, choose the second one to crash. Hope, this helps, Joerg
Created attachment 7076 [details] Sample pyuno package, that has no problems
Created attachment 7077 [details] Sample pyuno package, that provokes the crash
We'll fix it in either way. No idea yet, which we will choose.
I discussed this issue togther with Kay Ramme and our agreement was to set the target to OOo2.0.
Hi, Please give a reason for setting this (late)target. From having some insider knowledge, I can only imagine, that it is difficult to convince QA of the need to get this fix into the source tree (having this strange 30%-of-all-fixes-create-new-bugs-rule). The above rule is nonsense in this case, but I know, that a QA engineer in general does not understand this. I can't think of a fix, that is less dangerous than adding a try {} catch {} around a certain block of code, which is the last on the stack that can catch exceptions. On the other hand, the risks to leave this bug in the code are somewhat uncalculateable, as this is one of the most central pieces of code in the whole office. As exceptions are thrown under exceptional conditions, we cannot be sure, that this bug is already now a reason for a crash, which no qa engineer has discovered yet. It definitly will affect a lot of third party developers, that write extensions for OOo, both for java (e.g. because of a NullPointerException ) and python. In my eyes, with having this bug in the code, we cannot release a version, that we call OOo 1.1 final. At least, 'not in my name' :o). If we can't find an agreement, I will start a discussion in the dev-mailing list about issue criteria for OOo1.1 (which is the only escalation medium, I have). Bye, Joerg
Hi Kay, you should also have a look at my last comment in this issue.
Joerg, first: I totally AGREE with you that this is a really nasty problem (I think we raised it serveral times?) and that this needs to be fixed! But, I am against any hot fix. We don't have an error handling strategy in gerneral in place. To achieve what you (we :-) want - we either need to enable exceptions for all our code and therefor we REALLY need to ensure that all code is exception safe (e.g. does clean ups or frees allocated memory) and can NOT enter an inconsistent state - or, as you already proposed, we need to change every currently not exception aware code calling to exception aware code to catch everthing. This code is IMHO hard to find and probably a lot to fix and therefor risky. I think we are currently far away from both solutions. The second problem is, that just catching exceptions and ignoring them is wrong, we need here an exception-to-error-report functionality, which gives the user appropriate feedback. And the last point, the whole thing is really late in game. But anyway, I think we should be open if you find an adequat solution in time (we should be always open for this kind of improvements) for, say 1.1.1. Summary of points to do: - have an error handling concept in place -> fix the problem correctly and don't hack it :-) - either change all code to be exception aware or change those locations of code calling from non-exception aware to exception aware to catch everything (nasty detail problems: identify the code locations, decide on how to translate (e.g. must translate exception catchers with exceptions enabled -> need to change makefiles accordingly), test) - have an error reporting facility in place And, certainly feel free to discuss this issue on every adquat mailing list and find a solution. It is really NOT my intention to sweep anything under the carpet :*).
Hi Kay, I simply think, that a not reported error condition is better than a crash in a stable release (this is different in a unstable developer build). Strictly speaking, it would be correct to catch away the exception in product builds and to let it fly in non-pro builds, (but this is probably overkill). I would welcome an error concept here (the new scripting framework exactly has the same problem BTW, just mentioned this to John Rice). I can think of a error log file, a optional modeless logger window, message boxes, all best configurable. However, I am certainly not the person to design this, there must be something left to do for you :o). But really remember, we want to deliver a stable (which in first place means a not crashing) release. So I still vote to add the simple try/catch in OOo1.1. When we get a concept in 1.1.1, fine, add the code there then, but fix the crash now. Bye, Joerg
Joerg, again, fixing this problem as you suppose is only a "drop on the hot stone" :-). I am pretty sure that there a lot of other locations in the code which do not catch what they are supposed to catch or which are just not compiled with exceptions enabled. So, we CAN NOT do right currently anyway. So, I suggest to fix your problem in the pyuno binding and just eat all (runtime?) exceptions (you may want to log the messages somewhere).
Hi Kay, I can just repeat, that this is no pyuno problem, it is a UNO component problem, which affects Java, C++ and python binding. So would you suggest to fix this by catching away RuntimeExceptions in the the C++ or Java bridge ?! I also can't follow the 'There are many places, that don't work correctly' argument. I'd fix every place, of which I would be aware, that it provokes a crash, including this one. So, as a Ferengi would say, a crash is a crash is a crash. The office MUST NOT crash. As this is IMHO quite a fundamental question, I put this discussion to dev@openoffice.org ... Bye, Joerg
If we will fix this, we need to do it everywhere in the office code
@J??rg: After a long and heavy discussion we still don't like the idea of catching *all* RuntimeExceptions. IMHO it would be necessary to know more exactly what caused the RuntimeException: if the reason just is that UNO (means: UNO core + language binding) is not able to execute a requested call for whatever reason, it would be fine to get a specialized RuntimeException. The code doing the call and receiving this exception then treats this case as it wouldn't have tried to execute this call at all. Depending on the nature of the call many different but always well defined reactions are possible. A general RuntimeExeption is dangerous, because it is not known what side effects need to be considered, so the caller is absolutely at a loss. IMHO this situation is better handled by our Crash Guard, what currently would happen if the exception is not caught at all. What do you think: should we define such an Exception (derived from css.lang.RuntimeException) and use it in your case? BTW: AsynExec_Impl doesn't exist anymore in OOo2.0, all calls are executed synchronously, so the place where the problem occurs has moved.
Hi, it would be possible to throw a specialized runtime exception in case of e.g. pyuno compilation failures, etc. (However, this solves only the most urgent part of the problem, other situations (e.g. when the pyuno components calls an arbitrary uno component which itself throws a runtime exception) still impose a crash. Introducing a new runtime exception needs to be discussed with the uno team, because consequently the other existing bridges need to be modified also, so if you agree, I'd disuss it on dev@udk. BTW, can you give a concrete example, where a catched runtime exception at this place would leave the office in such an incosistent condition, that crash guard handling is the best exit ? Bye, Joerg
If a runtime exception is thrown, it is *possible* that the component throwing it is damaged, so the only useful way to handle this situation is to save all open documents and close the office process. This is done by the crash guard and so this exception does not need to be handled explicitly. Our case is special. Here we know that the component does not create a stability problem and it is safe to continue. More over, we know exactly what is going wrong, why shouldn't we tell this to the caller? If we don't know the reason of failure or if the caller can't cope with it (as in the general case of runtime exceptions) it is always a risk to continue. IMHO it doesn't need a concrete example, a caller can't handle all possible exceptions, and it is not probable that this leaves everything in a good shape. If continuation after arbitrary runtime exceptions is recommended I don't know why we have them at all. Our case is comparable with the DisposedException: here also the reason is known and it is reported to the caller. The caller then can decide how to proceed. I don't think that we need to change all bridges instantaneously, this could be done after fixing it for PyUNO. Your request seems to indicate that this is important for PyUNO, obviously it's less important for other bindings and so it can become fixed there later. I admit that this proposal does not solve the problem completely. It is very unlikely that every call to a UNO component will catch this new exception, so crashes like in our example will still happen. We can fix it only for selected calls, but that would be true for all other possible approaches also.
Hi Mathias, >Our case is special. Here we know that the component does >not create a stability >problem and it is safe to continue. how about the compromise to catch the RuntimeException in the framework/source/dispatch/servicehandler.cxx:implts_dispatch() method ? I guess, that the current office core code does not yet use this handler, does it ? What if we would leave it reserved for 3rd-party- code or code, which is known not to make the office inconsistent ? This would be the easiest ... > I don't think that we need to change all > bridges instantaneously, this could be > done after fixing it for PyUNO. Agreed, but I just want to be sure, that the way we implement it now can later be used by other language bindings. From what I am aware, we either could use the DisposedException or define a new one in the uno core, that's what I'd like to discuss before commiting something like that. > If continuation after arbitrary runtime exceptions is recommended I > don't know why we have them at all. From my understanding, the runtime exception is implicitly specified at every uno method, because every uno method is allowed to fail. However, when it fails, it must leave its object in a consistent state, otherwise the object itself must call abort() itself ... BTW, on 680m56, there does not appear a menu entry in the Tools menu, has there changed something about the addon mechanism ? Bye, Joerg
Hi Jörg, > how about the compromise to catch the RuntimeException in the > framework/source/dispatch/servicehandler.cxx:implts_dispatch() method ? I guess, > that the current office core code does not yet use this handler, does it ? What > if we would leave it reserved for 3rd-party- code or code, which is known not to > make the office inconsistent ? This would be the easiest ... I don't see a difference from this code to others: a *general* Runtime Exception is something that can't be handled sufficiently. IMHO a useful exception handling needs at least some knowlege about the reason. >> I don't think that we need to change all >> bridges instantaneously, this could be >> done after fixing it for PyUNO. > Agreed, but I just want to be sure, that the way we implement it now can later > be used by other language bindings. From what I am aware, we either could use > the DisposedException or define a new one in the uno core, that's what I'd like > to discuss before commiting something like that. Good idea. :-) I thought about a new exception, but at the end it wouldn't be a problem to use the DisposedException. >> If continuation after arbitrary runtime exceptions is recommended I >> don't know why we have them at all. > From my understanding, the runtime exception is implicitly specified at every > uno method, because every uno method is allowed to fail. However, when it fails, > it must leave its object in a consistent state, otherwise the object itself must > call abort() itself ... I don't think that this will help. Instead of calling abort() the CrashGuard needs to be called to allow for emergency recovery. Another case: damaged installation. Here at least an error message should be shown to the user. End user software never should terminate due to errors without a message to the user. So this should be handled by some code upwards on the call stack. For cases like these we need the "general" RuntimeException. Exceptions that can be handled locally should be specialised. > BTW, on 680m56, there does not appear a menu entry in the Tools menu, has there > changed something about the addon mechanism ? Not intentionally, Carsten said it's possible that there *was* a bug, but most recent masters shouldn't show it. Thanks for the hint, we will check that.
Andreas, please change the code in the "service" protocol handler accordingly. Currently it's OK to catch all Runtime exceptions, hopefully later we can use a more specific one (needs the help of Jörg). I'm not sure wether we find a suitable error message here, but that's not the most important thing now.
AS->JBU: I've fixed the implementation of the service handler and further the job execution class, so it catches ANY UNO exception and ignore it. I can live with that solution, because there we deal with external (means 3rd party) components only. And an office should be crashed by such "outside developed code" .-) On the other side there will be a further problem, which cant be adressed now: If you e.g. write directly a css.frame.ProtocolHandler service in phyton and your script force such runtime error, nobody will catch this exception. The reason behind: There is no central place where it can be done ... Dispatch objects will be provided to many office code places and used there directly. It make no sense to fix all these places (especialy if a special handling for different kinds of exceptions is needed).
jbu->as,mba: Hi Andreas, thx for catching the exception. The other places are currently no 'real-world' but more a ideological use-cases, so that's ok for me. I'd love to have a MsgBox() there, which just would provide something like "Runtimeerror : " and the message of the runtimeexception (somewhat similar to the message box in StarBasic which pops up e.g. when you pass wrong number of arguments to a uno function, etc.). Can you discuss with mba, whether this would be acceptable ? For the script developer, that would be a very big help during development and for the user of a 3rd party component, it gives the chance to report a sensible error message to the 3rd party developer. Bye, Joerg PS: Raised a discussion dev@udk about whether to create a new exception type derived form RuntimeException for this situation would be appropriate.
Hello Joerg! Sorry - but I cant show a MsgBox there. Why? a) We have an UI freeze currently. And I cant show an exception message without a description about the circumstances of this exception. b) Showing of any UI directly from the core isnt possible in general. The only thing I can do : I can establish a new user interaction - you know :-) ? I will file a new issue for that, which will be fixed for an OOo > 2.0 Version. Regards Andreas
reopened for sending bug to the QA.
AS->SG: Please verify this task on the cws "cd01". Ask me for further details. THX.
restore "Fixed" state
Hi Andreas, > b) Showing of any UI directly from the core isnt possible > in general. The only thing I can do : I can establish a new user > interaction - you know :-) ? > I will file a new issue for that, which will be fixed for an OOo > 2.0 Version. ok. Can you let me know the issue number (or shall I file one?) ? Thx and bye, Joerg
Checked this using a Java component that throws a RuntimeException in its constructor. The Exception does not appear on Windows and Solaris -> verified.
checked again on Windows (m130), works -> closed.