Apache OpenOffice (AOO) Bugzilla – Issue 69660
A Hebrew WPD document that probably hangs the layout engine
Last modified: 2017-05-20 11:17:54 UTC
This is a WordPerfect document that is converted correctly by the WP importer, but it hangs OOo (soffice.bin at 95-98% CPU) for at least 20 minutes when OOo tries to render it.
Since the document is 20 MB, uploading it over my ADSL line is not practical. I will do it tomorrow from the office.
Created attachment 39239: The document that hangs soffice.bin for over 15 minutes
MRU->SBA: please have a look; the attached WPD document will only open as an empty page on my system.
You will have to use 2.0.4 rcN or pp4. In the previous version, a bug in libwpd made this file cause libwpd to throw a file exception, so the importer did not get any content; that is why there was an empty page. Now, with the integration of fs06, libwpd is much more robust against corruption in WPD files, and this file is converted properly. But while it is being rendered, it simply hangs and soffice.bin uses all available resources. I tried to load it one evening, and the next morning it was still hanging.
One additional piece of information: we have a command line tool that converts a WPD document into SXW. You can make this tool write the resulting XML to stdout. When doing this, the resulting XML passes the "well formed" test and is valid. AbiWord also takes some minutes to render it, but eventually it shows the document. It is ~620 pages of Hebrew text. Do not ask me what it says :-)
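For anyone who wants to repeat the "well formed" check on the converter output, here is a minimal sketch in Python. It is only an illustration, not the check that was actually run: the function name `is_well_formed` is hypothetical, and it tests well-formedness only, not validity against any DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_data):
    """Return True if xml_data (str or bytes) parses as well-formed XML.

    Hypothetical helper for illustration. It does not validate the
    document against a DTD or schema; it only checks that the parser
    accepts it.
    """
    try:
        ET.fromstring(xml_data)
    except ET.ParseError:
        return False
    return True
```

With the converter writing its XML to stdout, the output could be piped into a small script that calls `is_well_formed(sys.stdin.buffer.read())`. Note that this loads the whole document into memory, which is acceptable for a one-off check on a 20 MB file.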
SBA->JW: Please have a look.
Hello Fridrich, I tried this document on src680_m195 and I could load it. It needs 5 minutes to load, but it loads fine. Could you please test it again on your system?
OK, now I can confirm this issue. I do not know why it worked on the first try, but now OOo hangs after the document has been loaded. I have to find a developer for this.
Thanks, Jack, for this one. The problem I had is that normally it loads and you see the first page. If you stop there, you feel like it was converted correctly. Trying to scroll to the end triggers a hang (with 99% of the processor in use on my laptop). I left it overnight on a powerful machine to see whether it was only a performance problem, and in the morning the hang was still there.
OK, further investigation. This file is not an ordinary WordPerfect file. If I open it in WordPerfect 10, a conversion starts; saving the file with Save As gives it an FRM extension, so it is a merged file. When saving it as WP5, WordPerfect hangs too. I think this is more a problem with the file itself.
It is possible that, because of the permissiveness of the WP file format, we produce something that is illegal in OOo. Nevertheless, I would appreciate knowing what. I am not able to see for myself where the problem lies. If we find the issue, I will be able to fix it for future generations :-)
following release status meeting -> target 3.x
MRU->OD: please have a look, maybe you can find out what causes the loop in Writer after the attached WPD has been imported. It could be that the import produces something illegal; at least MS Word says that the document is not a valid one...
The document is imported "correctly" by libwpd. Nevertheless, it is 20 MB of Hebrew text with around 1500 footnotes, so the layout engine is stress-tested with it. The HTML representation of the document looks like this: http://go-oo.org/~fridrich/hebrew.html The fact that MSO says that the document is corrupted is irrelevant; libreoffice is able to create a valid output (and recover as much data as possible) even from documents that are so corrupt that WP itself crashes on them.
Reassigned back to OD, since he is the "guru" of the layout engine. MRU->OD: just see my comment above...
Just a little hint for possible debugging: WordPerfect quite often stores Hebrew text in documents in visual order. Since the reverse bidi algorithm is not really well defined, and since making libwpd, which normally depends only on an STL implementation, depend on icu/fribidi/whatever is a nuisance, I insert markers into the RTL runs that force LTR rendering. It is possible that the layout engine is not really happy about that, or that it does not know how to handle this kind of situation on a large scale.
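To make the hint above concrete, here is a rough Python sketch of what "markers that force LTR rendering" could look like. This is not the libwpd code. It assumes the markers are the Unicode LEFT-TO-RIGHT OVERRIDE (U+202D) and POP DIRECTIONAL FORMATTING (U+202C) characters, which the comment does not actually state, and the function names are hypothetical.

```python
# Assumption: the "markers" are Unicode bidi override controls.
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def is_hebrew(ch):
    """True for characters in the Unicode Hebrew block (U+0590..U+05FF)."""
    return "\u0590" <= ch <= "\u05ff"


def force_ltr_on_hebrew_runs(text):
    """Wrap each run of Hebrew characters in LRO ... PDF.

    Text stored in visual order is then displayed in the stored order
    instead of being reordered by the bidi algorithm.
    """
    out = []
    in_run = False
    for ch in text:
        if is_hebrew(ch) and not in_run:
            out.append(LRO)   # open an override at the start of a run
            in_run = True
        elif not is_hebrew(ch) and in_run:
            out.append(PDF)   # close the override when the run ends
            in_run = False
        out.append(ch)
    if in_run:
        out.append(PDF)       # close a run that reaches the end of the text
    return "".join(out)
```

In this sketch every space ends a run, so each Hebrew word gets its own pair of markers. Over ~620 pages that adds up to a very large number of directional controls, which is the kind of large-scale situation the layout engine may not handle well.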
Reset assignee to the default "issues@openoffice.apache.org".