Issue 59185 - Search and Replace ignores max-character-limit of paragraphs (may cause data loss)
Alias: None
Product: Writer
Classification: Application
Component: editing
Version: OOo 2.0.1
Hardware: All All
Importance: P3 Trivial
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
Keywords: oooqa
Duplicates: 42899
Depends on: 17171
Reported: 2005-12-10 17:13 UTC by ftack
Modified: 2017-05-20 11:15 UTC (History)

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---

Document to reproduce the document corruption issue with regex search (11.41 KB, application/vnd.oasis.opendocument.text)
2005-12-10 17:14 UTC, ftack

Description ftack 2005-12-10 17:13:21 UTC
Steps to reproduce the problem

* Open the attached doc "longissue.odt". It is a 33 page doc, with blank lines
created using two paragraph breaks, and single paragraph breaks after single
sentences. Note at the very end of the document a line: "This is the last line.
In the original doc, the above three paragraphs were repeated 100 times." 100 is
counted using a Number range field. Typically, a more advanced user may want to
clean up this document, turning the two consecutive paragraph breaks into one
paragraph break, and removing the enters between the lines to combine them to
one paragraph.

The following procedure works for short pieces of text, but on a moderately
sized doc such as this, it causes severe data loss.

* Find "$" replace with "!token!", regular expressions checked. As yet another
issue with regex, not all paragraph marks end up being replaced by !token!.
Repeat the find/replace operation, hitting "replace all" again. The search now
takes significantly longer, and the watchful eye already sees that by now the
document is missing its end and is corrupted, so we could already stop
here. However, usually, one would notice this only later.

* Now Find "!Token!!Token!" replace with "!para!", regex not checked, to later
on replace each pair of consecutive paragraph breaks with one.

* Replace "!Token!" with " " (space) regex not checked.

* Replace "!para!" with "\n" regex checked to insert paragraph breaks again.

Note that we are left with 15 pages only, the last line cut off. Note also that
the Fields have been corrupted: "Instance x" counts to 10, then further on the
fields are removed.
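
For reference, the result the four replace steps are meant to produce can be
sketched on plain strings. This is only a simplified model, not Writer code:
the function name intended_cleanup and the list-of-strings document are
hypothetical, and fields are ignored. The 65534 value is the per-paragraph
limit cited later in this issue (see issue 17171). The sketch also records how
long the single paragraph becomes after the first step.

```python
PARAGRAPH_LIMIT = 65534  # per-paragraph character limit cited in this issue


def intended_cleanup(paragraphs):
    """Apply the four replace steps to a list of paragraph strings.

    Returns (cleaned_paragraphs, peak_length), where peak_length is the size
    of the single paragraph that exists after step 1.
    """
    # Step 1: replacing every paragraph end with "!token!" joins all text
    # into one paragraph (hypothetical model of the "$" replacement).
    text = "!token!".join(paragraphs)
    peak_length = len(text)
    # Step 2: two consecutive tokens mark an empty line between paragraphs.
    text = text.replace("!token!!token!", "!para!")
    # Step 3: a remaining single token was a break between sentences.
    text = text.replace("!token!", " ")
    # Step 4: turn each "!para!" marker back into a paragraph break.
    return text.split("!para!"), peak_length


sample = ["First sentence.", "Second sentence.", "", "Next paragraph."]
cleaned, peak = intended_cleanup(sample)
print(cleaned)
print("limit exceeded after step 1:", peak > PARAGRAPH_LIMIT)
```

On a short sample the peak length stays far below the limit, which matches the
observation that the procedure works for short pieces of text. For a document
of a few hundred such paragraphs the intermediate paragraph grows past 65534
characters.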

This is an issue resulting in severe data loss and document corruption which
therefore should have a high priority.
Comment 1 ftack 2005-12-10 17:14:24 UTC
Created attachment 32271 [details]
Document to reproduce the document corruption issue with regex search
Comment 2 lars 2005-12-10 19:12:26 UTC
confirmed on Windows XP Pro SP2 with OOo 2.0.1 RC4
Comment 3 rayll 2005-12-10 22:06:34 UTC
Confirmed on OOo 2.0, Suse Linux 9.3.

At the end, _all_ of my field numbers were corrupt, including the first 10.
Comment 4 michael.ruess 2005-12-12 09:24:28 UTC
Reassigned to SBA.
Comment 5 michael.ruess 2005-12-12 09:25:35 UTC
Reassigned to SBA.
Comment 6 stefan.baltzer 2005-12-12 11:52:45 UTC
SBA: P2 is the correct Prio for data loss. The office itself is running and
still usable.
Prio set to P2.
Comment 7 ftack 2005-12-12 12:35:56 UTC
Indeed, on, it was brought to my attention that OOo seems to have a
limit of 64 K for a single paragraph. Therefore, this issue is a consequence of
issue 17171. Issue 17171 has a priority of only 4, although in many different
ways, it may be the cause of data loss.
Comment 8 stefan.baltzer 2005-12-15 11:39:09 UTC
SBA: Reassigned to OS. Target set to OOo 2.0.3
Comment 9 lohmaier 2005-12-19 14:31:53 UTC
*** Issue 42899 has been marked as a duplicate of this issue. ***
Comment 10 lohmaier 2005-12-19 14:37:08 UTC
extended summary to match the real issue here.

original summary "Regular expressions search replace may result in severe data loss"

Search and replace concatenates paragraphs even when the result exceeds the
maximum character limit of 65534 characters (see issue 17171).

Search-and-replace should force paragraph-breaks at that limit to avoid
data-loss (and maybe report an error that the limit was reached).
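
A minimal sketch of the proposed behaviour, assuming a paragraph can be
modelled as a plain string: text longer than the limit is cut into consecutive
pieces of at most 65534 characters instead of being truncated, and the caller
is told that the limit was reached. The function name force_paragraph_breaks is
hypothetical and does not refer to any existing Writer API.

```python
PARAGRAPH_LIMIT = 65534  # maximum characters per paragraph (see issue 17171)


def force_paragraph_breaks(paragraph, limit=PARAGRAPH_LIMIT):
    """Split one over-long paragraph into pieces of at most `limit` chars.

    Returns (pieces, limit_reached). No character is dropped: joining the
    pieces gives back the original text.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if len(paragraph) <= limit:
        return [paragraph], False
    # Cut at fixed offsets; a real implementation might prefer word boundaries.
    pieces = [paragraph[i:i + limit] for i in range(0, len(paragraph), limit)]
    return pieces, True


pieces, limit_reached = force_paragraph_breaks("a" * 70000)
print(len(pieces), [len(p) for p in pieces], limit_reached)
```

Cutting at fixed offsets keeps the sketch short. The property that matters for
this issue is that the concatenation of the pieces equals the input, so
nothing is lost.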
Comment 11 Mathias_Bauer 2006-03-13 16:27:56 UTC
I think the target 2.0.3 is still too optimistic; it looks like a pretty exotic
Comment 12 Oliver Specht 2006-06-20 13:13:42 UTC
According to
prio changed to P3
Target adjusted
Comment 13 Martin Hollmichel 2007-09-10 13:36:03 UTC
move target to 3.x according
Comment 14 Marcus 2017-05-20 11:15:12 UTC
Reset assignee to the default "".