specifically, trying to substitute "\$0" for "[.?]" doesn't seem to work: it is not possible to escape the '\' because of the logic in subst(), so it always treats the $ as a literal (i.e. you get "$0" instead of e.g. "\."). also, is there any way to mark a substring of the regexp as literal (i.e. don't interpret metachars)?
Created attachment 8813: Patch to allow escaping of '\' character in substitution string
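For comparison only, here is a minimal sketch of the two things asked for above, written against java.util.regex (JDK 1.4+) rather than jakarta-regexp. The class and method names are made up for illustration, and nothing here implies that RE.subst() behaves the same way.

```java
import java.util.regex.Pattern;

// Illustrative only: uses java.util.regex, not org.apache.regexp.RE.
class EscapeMetachars {

    // Prefix every '.' or '?' with a backslash.
    // The Java literal "\\\\$0" is the four characters \\$0, which
    // java.util.regex reads as: one literal backslash, then group 0.
    static String escapeDotAndQuestion(String input) {
        return input.replaceAll("[.?]", "\\\\$0");
    }

    // Search for 'literal' with all of its metacharacters disabled.
    // Pattern.quote wraps the text so that it is matched verbatim.
    static boolean containsLiteral(String input, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(escapeDotAndQuestion("a.b?"));   // prints a\.b\?
        System.out.println(containsLiteral("xaxbx", "a.b")); // prints false
    }
}
```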
the above patch also has a fix for Bug 22928 (cutting the first two characters). i think it is slightly neater than the current fix in RE.java 1.14.
Christopher, please use 4-space indentation instead of tabs (as you may have noticed, the rest of the code does this). Vadim
The previously suggested fix has some problems with handling an escaped \, e.g. it's impossible to write a string that produces one \ followed by some backreferenced part (i.e. "\\\$0" will produce "\\<some_text>", not "\<some_text>"). Also, the patch introduces an incompatibility. To fix these problems I added a new REPLACE_WITH_ESCAPES constant and rewrote subst() so that it handles escaped characters consistently.
Created attachment 10615: new suggested fix
i can't reproduce the behaviour you describe with the patch that i wrote. at least when using string literals inside the code, "\\\$0" is not even a valid literal. "\\" is a valid escape sequence, but "\$" is not. if i wanted to include a "\" before whatever $0 is, i would need to use "\\\\$0", which results in "\<some_text>". can you show me an actual code snippet which causes the error you describe? also, what incompatibilities are you talking about?
try this:
    r = new RE("[.?]");
    actual = r.subst(".", "\\\\$0", RE.REPLACE_BACKREFERENCES);
    System.err.println(actual);
    assertEquals("Wrong subst() result", "\\.", actual);
    actual = r.subst(".", "\\\\\\$0", RE.REPLACE_BACKREFERENCES);
    System.err.println(actual);
    assertEquals("Wrong subst() result", "\\$0", actual);
Also your changes introduce incompatibility.
ah, i see what you mean. bummer. by incompatibilities, do you mean that trying to subst in "\\" results in an empty string :)? d'oh. nice fix, very 'solid', no fudging :). the only thing i would suggest (and perhaps there was a reason that escapes me) is using ArrayLists rather than Vectors to avoid the overhead of synchronization (which seems unnecessary here).
The incompatibility is that it is now possible to escape '\', whereas before it was not. I didn't use ArrayList because I'm not sure what the target JDK version for jakarta-regexp is (ArrayList was introduced in 1.2).
oic. thanks for the clarification.
I think that patch introduces more confusion than it solves. \$ without REPLACE_WITH_ESCAPES is still escaped, but logic suggests otherwise. Additionally, the behavior of escaping $ with \ is not documented in the Javadoc and not reflected in the unit tests. Because this behavior is undocumented and was introduced only recently (in the previous release), I suggest changing it (and documenting it in the Javadoc / unit tests). I suggest the following syntax:
    When REPLACE_BACKREFERENCES is on: process all $ as backreferences. No escaping is performed at all.
    When REPLACE_BACKREFERENCES and REPLACE_WITH_ESCAPES are both on: process all $ as backreferences. Process \ as the escape symbol.
So, what do you think? Vadim
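To make the proposed syntax concrete, here is a rough sketch of how the replacement string could be expanded under the two flag combinations. This is not the code from either attachment or from RE.java: the class name, the expand() helper and the flag values are hypothetical. It also assumes single-digit backreferences, a $ not followed by a digit being kept as-is, and a trailing lone \ being kept as-is; the proposal above does not say what should happen in those cases.

```java
// Hypothetical sketch of the proposed semantics; not taken from RE.java.
class SubstSketch {
    // Flag values are made up for this sketch only.
    static final int REPLACE_BACKREFERENCES = 1;
    static final int REPLACE_WITH_ESCAPES = 2;

    // Expands 'template' for one match. groups[0] is the whole match,
    // groups[1..] are the parenthesized subexpressions.
    static String expand(String template, String[] groups, int flags) {
        boolean backrefs = (flags & REPLACE_BACKREFERENCES) != 0;
        boolean escapes = backrefs && (flags & REPLACE_WITH_ESCAPES) != 0;
        if (!backrefs) {
            return template; // assumption: no processing without backreferences
        }
        StringBuilder out = new StringBuilder();
        int len = template.length();
        for (int i = 0; i < len; i++) {
            char c = template.charAt(i);
            if (escapes && c == '\\' && i + 1 < len) {
                // Escape mode: copy the next character verbatim.
                out.append(template.charAt(++i));
            } else if (c == '$' && i + 1 < len
                    && Character.isDigit(template.charAt(i + 1))) {
                // Backreference $0..$9; unset groups expand to nothing.
                int n = template.charAt(++i) - '0';
                if (n < groups.length && groups[n] != null) {
                    out.append(groups[n]);
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] groups = {"."};
        int both = REPLACE_BACKREFERENCES | REPLACE_WITH_ESCAPES;
        System.out.println(expand("\\$0", groups, REPLACE_BACKREFERENCES)); // prints \.
        System.out.println(expand("\\\\$0", groups, both));                 // prints \.
        System.out.println(expand("\\\\\\$0", groups, both));               // prints \$0
    }
}
```

With REPLACE_BACKREFERENCES alone, the original "\$0" request gives "\." because the \ is just another character. With both flags on, the two cases from the earlier test snippet produce "\." and "\$0".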