Bug 66468 - Regexp mapper replaces all backslash characters
Status: RESOLVED FIXED
Alias: None
Product: Ant
Classification: Unclassified
Component: Core
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: 1.10.14
Assignee: Ant Notifications List
Reported: 2023-02-06 14:57 UTC by Christian Stein
Modified: 2023-03-02 06:47 UTC



Description Christian Stein 2023-02-06 14:57:41 UTC
The code here https://github.com/apache/ant/blob/014e94e5d1e289acaf8a4e6e7234662c065d48b2/src/main/org/apache/tools/ant/util/RegexpPatternMapper.java#L133-L161 replaces all "\\X" occurrences with "X". A backslash that is meant literally is not preserved.

On Windows, this may convert a path in the "to" argument from "C:\a\b\c" to "C:abc".
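
For illustration, here is a minimal standalone sketch of the behaviour described above. It is not the Ant source: the class name BackslashDropSketch, the method replaceReferencesOld and the groups list are made up for this example, and keeping a trailing backslash as-is is an assumption. The loop only mirrors the idea that the character after a backslash is always consumed, whether or not it is a digit.

```java
import java.util.Arrays;
import java.util.List;

public class BackslashDropSketch {

    // Hypothetical stand-in for the pre-fix replacement loop:
    // "\N" becomes group N, and "\X" (X not a digit) becomes just "X".
    public static String replaceReferencesOld(String to, List<String> groups) {
        StringBuilder result = new StringBuilder();
        char[] chars = to.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == '\\') {
                if (++i < chars.length) {
                    int value = Character.digit(chars[i], 10);
                    if (value > -1) {
                        result.append(groups.get(value)); // \N -> matched group N
                    } else {
                        result.append(chars[i]); // the backslash itself is dropped here
                    }
                } else {
                    result.append('\\'); // trailing backslash kept (assumption)
                }
            } else {
                result.append(chars[i]);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // groups.get(0) is the whole match, groups.get(1) the first capture
        List<String> groups = Arrays.asList("a/j.java", "j.java");
        System.out.println(replaceReferencesOld("C:\\a\\b\\c", groups));   // prints C:abc
        System.out.println(replaceReferencesOld("foo\\bar=\\1", groups)); // prints foobar=j.java
    }
}
```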
Comment 1 Jaikiran Pai 2023-02-06 15:07:13 UTC
This appears to be a genuine issue. I could reproduce this in a testcase within the Ant project. A potential fix (with which the testcase passes) is as follows:

diff --git a/src/etc/testcases/types/mappers/regexpmapper.xml b/src/etc/testcases/types/mappers/regexpmapper.xml
index 08f0dedc6..4cca53a96 100644
--- a/src/etc/testcases/types/mappers/regexpmapper.xml
+++ b/src/etc/testcases/types/mappers/regexpmapper.xml
@@ -29,4 +29,10 @@
       <regexpmapper from="d/e/(.*)" to="\1" handledirsep="yes"/>
     </mapperresult>
   </target>
+
+  <target name="to-with-backslash-for-non-groups">
+    <mapperresult input="a/j.java" output="foo\bar=j.java">
+      <regexpmapper from="a/(.*)" to="foo\bar=\1" />
+    </mapperresult>
+  </target>
 </project>
diff --git a/src/main/org/apache/tools/ant/util/RegexpPatternMapper.java b/src/main/org/apache/tools/ant/util/RegexpPatternMapper.java
index 144722ab6..ec767f7a7 100644
--- a/src/main/org/apache/tools/ant/util/RegexpPatternMapper.java
+++ b/src/main/org/apache/tools/ant/util/RegexpPatternMapper.java
@@ -142,12 +142,14 @@ public class RegexpPatternMapper implements FileNameMapper {
         result.setLength(0);
         for (int i = 0; i < to.length; i++) {
             if (to[i] == '\\') {
-                if (++i < to.length) {
-                    int value = Character.digit(to[i], DECIMAL);
+                final int nextCharIndex = i + 1;
+                if (nextCharIndex < to.length) {
+                    int value = Character.digit(to[nextCharIndex], DECIMAL);
                     if (value > -1) {
+                        i++; // mark that the next digit (after the backslash) has been consumed
                         result.append(v.get(value));
                     } else {
-                        result.append(to[i]);
+                        result.append(to[i]); // append the backslash character
                     }
                 } else {
                     // TODO - should throw an exception instead?
diff --git a/src/tests/junit/org/apache/tools/ant/types/mappers/RegexpPatternMapperTest.java b/src/tests/junit/org/apache/tools/ant/types/mappers/RegexpPatternMapperTest.java
index 8f95d63d1..9def8bae5 100644
--- a/src/tests/junit/org/apache/tools/ant/types/mappers/RegexpPatternMapperTest.java
+++ b/src/tests/junit/org/apache/tools/ant/types/mappers/RegexpPatternMapperTest.java
@@ -46,4 +46,14 @@ public class RegexpPatternMapperTest {
     public void testHandleDirSep() {
         buildRule.executeTarget("handle.dirsep");
     }
+
+    /**
+     * Test that if the {@code to} attribute of {@code regexpmapper} contains a backslash
+     * character which isn't followed by a digit (representing regex group) then the backslash
+     * doesn't disappear from the output. See bug 66468 for details
+     */
+    @Test
+    public void testBackslashInTo() {
+        buildRule.executeTarget("to-with-backslash-for-non-groups");
+    }
 }


I'll run more tests and consider this fix with a fresher mind tomorrow.
Comment 2 Christian Stein 2023-02-06 15:20:52 UTC
Perhaps supporting a \Q ... \E guard pattern would be more robust? Looking at you: "D:\1\2\3" and other paths that contain single-digit elements.

to = "\1\Q${src}\E\2"
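
To make the concern concrete, here is a small standalone sketch of the replacement logic as changed by the patch in comment 1. The class name PatchedMapperSketch, the method replaceReferences and the groups list are hypothetical names for this example only, and the trailing-backslash branch is an assumption. It shows that a backslash followed by a non-digit is now kept, while a backslash followed by a digit is still read as a group reference, so "D:\1\2\3" is not treated as a literal path.

```java
import java.util.Arrays;
import java.util.List;

public class PatchedMapperSketch {

    // Hypothetical stand-in for the patched loop: only "\N" (N a digit) is special.
    public static String replaceReferences(String to, List<String> groups) {
        StringBuilder result = new StringBuilder();
        char[] chars = to.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == '\\') {
                int next = i + 1;
                if (next < chars.length) {
                    int value = Character.digit(chars[next], 10);
                    if (value > -1) {
                        i++; // the digit after the backslash has been consumed
                        result.append(groups.get(value));
                    } else {
                        result.append(chars[i]); // keep the literal backslash
                    }
                } else {
                    result.append('\\'); // trailing backslash kept (assumption)
                }
            } else {
                result.append(chars[i]);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Index 0 is the whole match, 1..3 are capture groups
        List<String> groups = Arrays.asList("x/y/z", "x", "y", "z");
        System.out.println(replaceReferences("C:\\a\\b\\c", groups));  // prints C:\a\b\c
        System.out.println(replaceReferences("foo\\bar=\\1", groups)); // prints foo\bar=x
        System.out.println(replaceReferences("D:\\1\\2\\3", groups));  // prints D:xyz
    }
}
```

The last call is the case raised here: path elements that are single digits are still swallowed as group references, and with fewer than three groups available the lookup in this sketch would fail with an IndexOutOfBoundsException.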
Comment 3 Jaikiran Pai 2023-03-02 06:47:53 UTC
I pushed a fix to our upstream repo which matches the patch that I proposed previously. I decided not to do anything more complicated than that or introduce any new semantics in the regex mapping code.

The fix should be available in our next release (no specific dates).