Apache OpenOffice (AOO) Bugzilla – Issue 87092
Find and Replace with formatting inserts unwanted "x"
Last modified: 2013-08-07 14:43:00 UTC
I think this is a regression; I have certainly noticed it only in the last couple of months. If I use a macro to replace all italic-formatted text in a document with plain text surrounded by the equivalent HTML markup, a spurious "x" is added at the end of the replacement string.

Steps to repeat:
(1) Write a document containing two or more stretches of italicised text.
(2) Run the macro below.

Expected result: the italic text ranges are converted to plain text and given tags at each end that say "<em>" and "</em>".

Actual result: the italic text ranges are converted to plain text and given tags at each end that say "<em>" and "x</em>". Note the extra, spurious "x".

The macro that does this:

Sub ital_to_HTML_Tags
    dim oDoc,oRD,numfound
    dim italicargs(0)as new com.sun.star.beans.PropertyValue
    dim normalargs(0)as new com.sun.star.beans.PropertyValue
    oDoc=thiscomponent
    oRD=oDoc.createReplaceDescriptor()
    italicargs(0).Name="CharPosture"
    italicargs(0).Value=com.sun.star.awt.FontSlant.ITALIC
    oRD.ValueSearch=True
    oRD.SearchWords=True
    normalargs(0).Name="CharPosture"
    normalargs(0).Value=com.sun.star.awt.FontSlant.NONE
    oRD.setSearchAttributes(italicargs())
    oRD.setReplaceAttributes(normalargs())
    oRD.setSearchString(".+")
    oRD.setReplaceString("<em>&</em>")
    oRD.SearchRegularExpression=True
    oRD.SearchStyles=False
    numfound=oDoc.ReplaceAll(oRD)
    ' msgbox(numfound)
end Sub

An equivalent macro to replace bold formatting with <b> tags has the same bug.
Reassigned to SBA.
I came across the same problem independently of the reporter. I can confirm the bug is present in beta 4.0 rc2 and rc6, and likely in other RCs too. The bug is not present in 2.3.1 or earlier, so it looks like some new feature or fix has created this problem. The bug can be isolated to oDoc.ReplaceAll. Tested with italic and bold.
keyword added after confirmation from jurf
Bug also confirmed on OOo 3.0 beta m9 (OOo-Dev_DEV300_m9_Win32Intel). This is a definite regression introduced in the 2.4 beta series. Unfortunately, oDoc.ReplaceAll is a vital component of macros I use to clean up formatting (e.g. in OCR output); as such, this issue is a showstopper as it prevents me moving beyond OOo 2.3.
Urk, it's not just macros: I also get stray "x"s using the "Find & Replace" UI (OOo-dev 3.0).

To reproduce, paste in a bunch of text, mark part of it bold, and an overlapping section as italic, e.g.:

Gimme some x's: [b]with luck following here [i]and here[/b] - see?[/i] Rats.

(Don't copy the tags; they're only there to suggest which bits you turn to bold and which to italic.)

The "Find & Replace" dialog settings are these:
Search for: .+
Format: bold
Replace with: [b]&[/b]
Regular Expressions
-> Replace All

Ouch. Looks like the bug affects the code that splits the tags to avoid overlaps.

Workaround (needs a macro): before running a search/replace involving formatting, protect your x's by switching them to some unique tag. Once you've run your replace routine, delete all the x's, then revert the temporary tags back to real x's.
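The protect/replace/clean-up/restore order of that workaround can be sketched on a plain string. This is only an illustration in Python, not a Writer macro: `buggy_wrap` is a hypothetical stand-in that imitates the stray "x", and the `@@EX@@` placeholder is an arbitrary choice that is assumed not to occur in the text. The sketch also assumes the replacement tags themselves contain no "x".

```python
PLACEHOLDER = "@@EX@@"  # hypothetical tag; must not already occur in the text


def buggy_wrap(text, open_tag, close_tag):
    """Stand-in for the affected replace: wraps the text but adds a stray 'x'."""
    return f"{open_tag}{text}x{close_tag}"


def protected_wrap(text, open_tag, close_tag, placeholder=PLACEHOLDER):
    """Apply the workaround: protect real x's, replace, clean up, restore."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    if "x" in open_tag + close_tag or "x" in placeholder:
        raise ValueError("tags and placeholder must not contain 'x'")

    # 1. Protect the genuine x's by switching them to a unique tag.
    protected = text.replace("x", placeholder)
    # 2. Run the replace that is known to insert the spurious 'x'.
    replaced = buggy_wrap(protected, open_tag, close_tag)
    # 3. Every 'x' still present is spurious, so delete them all.
    cleaned = replaced.replace("x", "")
    # 4. Revert the temporary tags back to real x's.
    return cleaned.replace(placeholder, "x")


if __name__ == "__main__":
    print(buggy_wrap("six boxes", "<em>", "</em>"))
    print(protected_wrap("six boxes", "<em>", "</em>"))
```

In a real macro, each of the three plain replacements would be its own case-sensitive ReplaceAll call made around the formatted one.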
To reproduce this there is no need for overlapping attributes: a dummy text with just one bold word is enough to get an additional "x". Even though I think there is a bug somewhere in an external library that is involved, I can easily fix this in my code. Fixed in CWS sw30bf05, findtxt.cxx.
SBA: Adjusted summary. Was OK in OOo 2.3.1, broken in OOo 2.4.
*** Issue 83938 has been marked as a duplicate of this issue. ***
*** Issue 87642 has been marked as a duplicate of this issue. ***
*** Issue 88726 has been marked as a duplicate of this issue. ***
Fix copied to CWS dba241e, which targets 2.4.1.
fs->sba: please verify in CWS dba241e
ama->sba: Checked in dba241e => Ready for QA. But don't close this issue afterwards because you have to verify it in sw30bf05 (DEV300) again.
Verified in CWS dba241e.
SBA: Re-verified in OOH680_m17 (=> OK in OOo 2.4.1). Not closing now; see AMA's comment above.
.
@andrewb Can you verify it in the latest master?
SBA: OK in Build OOO300_m7 (OOo 3.0 RC2). Closed.