Apache OpenOffice (AOO) Bugzilla – Issue 55464
Writer losing large portion of my document
Last modified: 2005-10-04 21:11:22 UTC
I have experienced a reproducible bug where Writer loses a large piece of my document. Unfortunately, it only occurs in one document that I have, and due to its nature, I'm not able to attach this document. I will try to describe the situation in as much detail as possible:

The original document is in Word 2000 format. The original is also password protected, but I've tried it without the password protection and this does not have any effect, so that is not related.

A part of my document looks like this, under Word:

AAAABBBB
CCCC

Both A and B are one continuous block of text, and are labelled differently only to help illustrate the change that occurs. In fact, the A-B join is mid-word, not whitespace of any kind. C is the following paragraph. Overall, these blocks represent about 20,000 words.

When imported and displayed under Writer, it displays as:

AAAA
BBBB
CCCC

A mysterious new line has appeared, which is, as previously mentioned, mid-word. Enabling the option to show hidden formatting characters shows only a newline character, nothing unusual.

Obviously I don't want this new line, so I position the cursor at the end of A, and press the delete key once to bring B upwards. What happens, however, is that B disappears completely, and I'm left with:

AAAA
CCCC

If I choose the undo option here, the text does not return. It's lost, and I have so far found no way to recover it other than reloading the document.

If I position the cursor at the start of B, and use backspace to achieve the same effect, an even larger block of text disappears, which seems to extend from the end of A far up into my document. Possibly 50,000 words or more.

I also tried positioning the cursor at the end of A, and typing some text. Any characters I type do not appear. If I delete a few characters from the end of A, say five, then I can replace those five deleted characters with five more, but cannot type beyond that limit.
Curiously, if I type some extra characters on the end of this line, although they are hidden, they show up if I cause the bug to occur, leaving me with:

AAAA[Extra characters]
CCCC

This bug occurs in both the 2.0 RC and the latest 1.x version.

I tried saving the document under various formats, including OpenDocument, OpenOffice 1 and Microsoft Word, but the same thing occurs in all cases. I tried altering the document in Word first before importing it, applying styles, adding whitespace and other things, and all of these show in Writer, but do not help the issue.

I tried cutting the document down to just the area that has the problem, and this causes the problem to disappear. However, deleting half of my document is obviously not a solution. Copying the area of the document to a new, blank document also does not exhibit this bug. Copying the entire document to a new, blank document *does* exhibit this bug. Clearly, the position is important in some way. It's on page 146.

As mentioned, I cannot provide the document, but if there are any debugging tools or other methods I can try to see any 'hidden codes' and such at the point where the problem occurs, or any more information you would like to know, I am happy to provide it. Possibly I could write some kind of script to scramble the text so that I could submit it. I can reproduce this bug on demand, so I can also try other suggestions and report the result if requested. If screenshots would be helpful in any way, I could provide those too.

I am using Windows 2000 Professional SP4.
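For what it's worth, the "scramble the text" idea could look roughly like the sketch below. It is only an illustration, not a tested tool: it assumes the document has first been exported as plain text, and the function name `scramble` is made up for this example. It replaces every letter and digit with a random one and leaves whitespace, punctuation and line breaks alone, so the number of characters in each paragraph stays the same. It would not preserve any Word formatting.

```python
import random
import string


def scramble(text, seed=None):
    """Replace letters and digits with random ones; keep everything else.

    The output has exactly as many characters as the input, and every
    space, punctuation mark and line break stays where it was.
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            # Lowercase letters and letters without case (e.g. CJK).
            out.append(rng.choice(string.ascii_lowercase))
        else:
            # Whitespace, punctuation, line breaks: left untouched.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = "Hello, World 42!\nNext line."
    print(scramble(sample))
```

Because only the characters change and not their count, a scrambled copy should still show the problem if it depends on the length of the text rather than on its content.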
It sounds like the paragraph is too long. The size of a paragraph is limited to 65535 characters. Is the paragraph longer than that? (That would be issue 17171.) Can you send the document by private mail to a developer? Or can you replace the content with dummy text? Other ideas: Have you tried using "Web Layout"? Are you using a non-Latin language, or a special font? Perhaps the text is inside a frame in Word? Frames cannot extend over several pages.
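One rough way to answer the "is the paragraph longer than that?" question without counting by hand is sketched below. It assumes the document has been saved as plain text with one paragraph per line; the function name `long_paragraphs` is invented for this example, and the only figure taken from this issue is the 65535-character limit.

```python
MAX_PARAGRAPH_CHARS = 65535  # paragraph size limit quoted in this issue


def long_paragraphs(text, limit=MAX_PARAGRAPH_CHARS):
    """Return (paragraph_number, length) for each paragraph over the limit.

    Assumes one paragraph per line, as in a plain-text export.
    Paragraph numbers start at 1.
    """
    result = []
    for number, paragraph in enumerate(text.split("\n"), start=1):
        if len(paragraph) > limit:
            result.append((number, len(paragraph)))
    return result


if __name__ == "__main__":
    # Three dummy paragraphs; only the middle one exceeds the limit.
    text = "\n".join(["a" * 10, "b" * 70000, "c" * 65535])
    for number, length in long_paragraphs(text):
        print(f"paragraph {number}: {length} characters")
```

Any paragraph this reports would be a candidate for the behaviour described in issue 17171.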
The block where the problem occurs is, according to Word, 156,868 characters in length (including spaces) and is continuous text. After reading #17171, I did a character count in Writer up to the mid-word break, and indeed, it's 65,534 characters. Thanks for identifying the problem for me, but can I ask why this issue (17171) is given such low priority? I understand the argument that it's not ideal to be writing such long paragraphs, but it's a serious issue that Writer DISCARDS large amounts of text if any attempt is made to rejoin the paragraphs. Undo has no effect. At the very least, Writer should not allow any actions that would cause this to happen, and should present an error and explanation to the user. Simply 'losing' the text is unacceptable.
So I will close this as a duplicate of issue 17171. I'm not a developer, only an oooqa member, so I cannot tell you anything about the Target Milestone decisions. Please write your comments to issue 17171, not here. If it is important for you, you can vote for that issue.

*** This issue has been marked as a duplicate of 17171 ***
closing duplicate