Bug 58988 - $ escaping for rewrite
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Catalina
Version: unspecified
Hardware: PC All
Importance: P2 major
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
Reported: 2016-02-10 14:22 UTC by Remy Maucherat
Modified: 2017-11-15 21:25 UTC



Attachments
Let backslashes escape characters (4.69 KB, patch)
2016-02-10 14:53 UTC, Felix Schumacher

Description Remy Maucherat 2016-02-10 14:22:38 UTC
The following escaping behavior should be implemented:
https://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#quoting
Comment 1 Felix Schumacher 2016-02-10 14:53:29 UTC
Created attachment 33544
Let backslashes escape characters

This will enable escaping (quoting) by using a backslash.

Apart from this, it will enable escaping the percent sign by using %%. It also fixes a bug that occurred when % was not followed by a digit or a curly brace.

The functionality for %% should probably not be added.

Documentation for the quotation feature is missing, too.
Comment 2 Remy Maucherat 2016-02-10 14:55:56 UTC
According to the documentation, % should normally be escaped as \%, and in no other way.
Comment 3 Remy Maucherat 2016-02-11 12:54:26 UTC
This looks fixed by r1729730
Comment 4 Felix Schumacher 2016-02-11 16:34:40 UTC
Fixed in 9.0.0.M4 and 8.0.33.
Comment 5 Stefan 2017-11-09 18:55:46 UTC
This does not seem to be fixed in 8.5.20: \%20 was converted to %2520.
Comment 6 Felix Schumacher 2017-11-09 19:58:17 UTC
(In reply to Stefan from comment #5)
> This does not seem to be fixed in 8.5.20: \%20 was converted to %2520.

What happens when you don't place the backslash in front of %20?
Comment 7 Stefan 2017-11-10 15:06:52 UTC
With backslash
---
RewriteRule    ^/context/id=(\w{3})(\d{12}).*$          %{CONTEXT_PATH}/site/form/?mode=search&key=Help\%20Desk&key2="$1$2" [NC,NE,L]

RESULT - wrong URL
         fqdn/context/site/form/?mode=search&key=Help%2520Desk&key2="1234"

Without backslash
---
RewriteRule    ^/context/id=(\w{3})(\d{12}).*$          %{CONTEXT_PATH}/site/form/?mode=search&key=Help%20Desk&key2="$1$2" [NC,NE,L]

RESULT - Exception
10-Nov-2017 16:03:52.643 SEVERE [http-nio-80-exec-28] org.apache.coyote.http11.Http11Processor.service Error processing request
 java.lang.NullPointerException
	at org.apache.catalina.valves.rewrite.Substitution$RewriteCondBackReferenceElement.evaluate(Substitution.java:65)
	at org.apache.catalina.valves.rewrite.Substitution.evaluate(Substitution.java:269)
	at org.apache.catalina.valves.rewrite.RewriteRule.evaluate(RewriteRule.java:135)
	at org.apache.catalina.valves.rewrite.RewriteValve.invoke(RewriteValve.java:313)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:799)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1457)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.lang.Thread.run(Thread.java:748)
Comment 8 Felix Schumacher 2017-11-10 21:06:31 UTC
When a RewriteRule such as

RewriteRule /abc /r%20

is used, the %2 will be interpreted as a back-reference to a match.

Your condition has two matching groups, but %2 would reference back to a third group, that does not exist. That's where the NPE comes from. (I left out the matching groups for simplicity.)
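
To make the parsing described above concrete, here is a minimal sketch of how a substitution string might be split into tokens. It is a simplified, hypothetical model and not Tomcat's actual Substitution class: the class name SubstitutionSketch, the method tokenize, and the token labels are all assumptions made for illustration, and %{...} variables are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of substitution parsing (not Tomcat's real code).
final class SubstitutionSketch {

    // Splits a substitution string into labelled tokens.
    public static List<String> tokenize(String sub) {
        List<String> tokens = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        int i = 0;
        while (i < sub.length()) {
            char c = sub.charAt(i);
            boolean hasNext = i + 1 < sub.length();
            if (c == '\\' && hasNext) {
                // A backslash makes the following character literal.
                literal.append(sub.charAt(i + 1));
                i += 2;
            } else if ((c == '%' || c == '$') && hasNext
                    && Character.isDigit(sub.charAt(i + 1))) {
                // % or $ followed by a single digit is read as a back-reference.
                flush(literal, tokens);
                String kind = (c == '%') ? "COND_BACKREF" : "RULE_BACKREF";
                tokens.add(kind + "(" + sub.charAt(i + 1) + ")");
                i += 2;
            } else {
                literal.append(c);
                i++;
            }
        }
        flush(literal, tokens);
        return tokens;
    }

    // Emits the pending literal text, if any, as one token.
    private static void flush(StringBuilder literal, List<String> tokens) {
        if (literal.length() > 0) {
            tokens.add("LITERAL(" + literal + ")");
            literal.setLength(0);
        }
    }

    public static void main(String[] args) {
        // prints [LITERAL(/r), COND_BACKREF(2), LITERAL(0)]
        System.out.println(tokenize("/r%20"));
        // prints [LITERAL(/r%20)]
        System.out.println(tokenize("/r\\%20"));
    }
}
```

In this model the unescaped %20 never survives as text: %2 becomes a back-reference and only the trailing 0 stays literal.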

If you escape the % with a backslash, it is put verbatim into the (URL-decoded) rewritten path, and the % is then URL-encoded as %25 in the final step.
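
The double encoding can be reproduced with a small hand-written percent-encoder. This is only a sketch under assumptions: the class PercentEncodingSketch and its encode method are hypothetical, and it encodes everything outside the RFC 3986 unreserved set, which is stricter than what Tomcat's own encoder is likely to do (for example, it would also encode "/").

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the final URL-encoding step (not Tomcat's encoder).
final class PercentEncodingSketch {

    // RFC 3986 unreserved characters; everything else is percent-encoded.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    public static String encode(String decoded) {
        StringBuilder out = new StringBuilder();
        for (byte b : decoded.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (UNRESERVED.indexOf(c) >= 0) {
                out.append(c);
            } else {
                // Write the byte as % followed by two upper-case hex digits.
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal text "%20" is encoded again: prints Help%2520Desk
        System.out.println(encode("Help%20Desk"));
        // A real space is what would yield the wanted result: prints Help%20Desk
        System.out.println(encode("Help Desk"));
    }
}
```

The second call shows what the rule would need to contain to get Help%20Desk, namely an actual space in the decoded substitution.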

What you need is a way to encode %20 into a static string - that is a space in this case, right?
Comment 9 Felix Schumacher 2017-11-10 21:30:17 UTC
Can you try and add an R flag to the variant that has the escaped percent sign?
Comment 10 Stefan 2017-11-12 14:20:50 UTC
That's right, I missed %2 as a back-reference.
Adding an R flag does the job. I don't know why, and it seems a bit tricky, but it solves the problem.

Thank you very much!
Comment 11 Mark Thomas 2017-11-15 21:25:54 UTC
Re-resolving this as FIXED since the original issue is fixed.

%20 is a special case. The rules are based on decoded URIs. The problem is that space is a delimiter for the rules so while there are ways (via regular expressions) to use space in a pattern, there isn't a way to include a space in the substitution.
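
A short sketch of why a literal space cannot appear in the substitution, assuming the rule line is split on whitespace as described above. The class RuleLineSketch and its split method are hypothetical, and the use of StringTokenizer is an assumption about how such a split could be done, not a statement about Tomcat's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration of whitespace-delimited rule parsing.
final class RuleLineSketch {

    // Splits a rule line on whitespace, the delimiter between its fields.
    public static List<String> split(String line) {
        List<String> parts = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            parts.add(tokenizer.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [RewriteRule, ^/abc$, /Help, Desk, [L]]
        System.out.println(split("RewriteRule ^/abc$ /Help Desk [L]"));
    }
}
```

The intended substitution "/Help Desk" is cut into two fields, so the part after the space would no longer be read as part of the substitution.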

One option would be to decode the rules but that would likely break existing rules.

Using R is probably the best workaround if you need to include a space in the rewritten URI.