Bug 65631

Summary: Proxy CONNECT error 500
Product: Apache httpd-2 Reporter: Pavel Mateja <pavel>
Component: mod_proxy_connectAssignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED INVALID    
Severity: normal    
Priority: P2    
Version: 2.4.51   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Patch for this bug
filtered tcpdump output of one request returning 500
trace for one 500 request

Description Pavel Mateja 2021-10-13 16:16:02 UTC
Created attachment 38066 [details]
Patch for this bug

Hi,
I was getting lots of code 500 for CONNECT method in my proxy access log and very long lasting requests in server-status lately.
I tried to dig a little bit so you can check end of debug/trace output of one such request (ip addresses and server name were obfuscated, sorry):

...
proxy_util.c(4909): [client AB.XYZ.48.134:9045] proxy: CONNECT: #0: origin: 1/31
proxy_util.c(4810): [client AB.XYZ.48.134:9045] proxy: CONNECT: origin input ready
proxy_util.c(4644): [client AB.XYZ.48.134:9045] ap_proxy_transfer_between_connections complete (from AB.XYZ.49.14:443)
proxy_util.c(4878): [client AB.XYZ.48.134:9045] proxy: CONNECT: polling (client=31, origin=31)
proxy_util.c(4902): [client AB.XYZ.48.134:9045] AH10215: proxy: CONNECT: woken up, 1 result(s)
proxy_util.c(4909): [client AB.XYZ.48.134:9045] proxy: CONNECT: #0: origin: 1/31
proxy_util.c(4810): [client AB.XYZ.48.134:9045] proxy: CONNECT: origin input ready
proxy_util.c(4644): (70014)End of file found: [client AB.XYZ.48.134:9045] ap_proxy_transfer_between_connections complete (from AB.XYZ.49.14:443)
proxy_util.c(4838): [client AB.XYZ.48.134:9045] proxy: CONNECT: origin read shutdown
proxy_util.c(4878): [client AB.XYZ.48.134:9045] proxy: CONNECT: polling (client=35, origin=0)
proxy_util.c(4902): [client AB.XYZ.48.134:9045] AH10215: proxy: CONNECT: woken up, 1 result(s)
proxy_util.c(4909): [client AB.XYZ.48.134:9045] proxy: CONNECT: #0: client: 4/35
proxy_util.c(4943): [client AB.XYZ.48.134:9045] proxy: CONNECT: client output ready
proxy_util.c(4966): [client AB.XYZ.48.134:9045] proxy: CONNECT: client write shutdown
proxy_util.c(4878): [client AB.XYZ.48.134:9045] proxy: CONNECT: polling (client=31, origin=0)
proxy_util.c(4902): [client AB.XYZ.48.134:9045] AH10215: proxy: CONNECT: woken up, 1 result(s)
proxy_util.c(4909): [client AB.XYZ.48.134:9045] proxy: CONNECT: #0: client: 31/31
proxy_util.c(4810): [client AB.XYZ.48.134:9045] proxy: CONNECT: client input ready
proxy_util.c(4544): (104)Connection reset by peer: [client AB.XYZ.48.134:9045] AH03308: ap_proxy_transfer_between_connections: error on client - ap_get_brigade
proxy_util.c(4644): (104)Connection reset by peer: [client AB.XYZ.48.134:9045] ap_proxy_transfer_between_connections complete (to AB.XYZ.49.14:443)


The point is that the proxy got 'End of file found' but tried to continue.

I'm not sure if it's not obsolete, but RFC2817 says:

5.3 Establishing a Tunnel with CONNECT
...
If at any point either one of the peers gets disconnected, any
outstanding data that came from that peer will be passed to the other
one, and after that also the other connection will be terminated by
the proxy. If there is outstanding data to that peer undelivered,
that data will be discarded.

Therefore I tried to apply simple and ugly hack and the strangest thing has happend. It seems to change those pesky 500 to 200 and I don't see any hanging connections anymore.
Can somebody check attached patch and write me what he or she thinks of it please?

Thanks,

Pavel Mateja
Comment 1 Yann Ylavic 2021-10-13 17:31:09 UTC
Hi Pavel,

> I was getting lots of code 500 for CONNECT method in my proxy access log and
> very long lasting requests in server-status lately.

The logs show no timings so the "very long lasting requests" point is difficult to diagnose here.

> I tried to dig a little bit so you can check end of debug/trace output of
> one such request (ip addresses and server name were obfuscated, sorry):
[snip]
> 
> The point is that the proxy got 'End of file found' but tried to continue.

Yes, the tunnel (and TCP) handles half-closing (i.e. one way closing then the other) because some tunneled protocols work like this: once one end has nothing more to write it will shut down its connection "for write" but will still read any data sent by the peer until it in turn shuts down "for write" itself.
Thus it's only when the connection is shut down in both ways that the tunneling ends.

> 
> I'm not sure if it's not obsolete, but RFC2817 says:
> 
> 5.3 Establishing a Tunnel with CONNECT
> ...
> If at any point either one of the peers gets disconnected, any
> outstanding data that came from that peer will be passed to the other
> one, and after that also the other connection will be terminated by
> the proxy. If there is outstanding data to that peer undelivered,
> that data will be discarded.

Unfortunately this does not define what "disconnected" means, half-closed connections are still "connected" for instance in some way, but connections reset (implying both ways) are not.
So mod_proxy will stop the tunneling (per above) if a read returns an error, but EOF (End Of File) is not considered an error.

> 
> Therefore I tried to apply simple and ugly hack and the strangest thing has
> happend. It seems to change those pesky 500 to 200 and I don't see any
> hanging connections anymore.

Where does the tunneling hangs actually? Timings?

What should happen is that when the proxy sends a shutdown/FIN on the opposite side of the tunnel where it received the shutdown/FIN itself, and when the peer receiving the FIN is not willing to send more data (which is immediately the case of any application that does not handle half-closing), the peer should in turn close/FIN the connection which the proxy detects and then forwards on the other side, etc until everything is shut down both ways.
Some applications don't close/FIN the connection when they receive a FIN but rather reset/RST it, which seems to be your case. Still the tunneling should end as soon as this RST is received by the proxy, so it does not explain the "hang".

> Can somebody check attached patch and write me what he or she thinks of it
> please?

It breaks half-closing so before anything it would be useful to understand why it hangs in the first place..

As for the 500 in the access, it's possibly something the proxy can do about if it happens during half-closing..
Comment 2 Pavel Mateja 2021-10-14 12:08:36 UTC
Hi Yann,
you are right.
One of our applications did not close sockets after proxied requests properly.
I was able to see long lasting connections in FIN_WAIT2 state but still assigned to Apache's PID. Like
tcp 0 0 AB.XYZ.49.8:8080 AB.XXX.48.134:53119 FIN_WAIT2 109016/httpd
instead of just
tcp 0 0 AB.XYZ.49.8:8080 AB.XXX.48.134:1916 FIN_WAIT2 -
In such case kernel won't clean them up.

I can see
AH10212: proxy: CONNECT: tunnel running (timeout -1.000000)
in debug log.
Which configuration parameter sets this timeout nowadays? ProxyTimeout did not do the trick.

This was just fixed on application side and such connections are no more.

But those 500 in logs are still present.
I've tried older Apache versions and I don't see them with 2.4.46.
Comment 3 Ruediger Pluem 2021-10-14 13:39:55 UTC
(In reply to Pavel Mateja from comment #2)

> 
> But those 500 in logs are still present.
> I've tried older Apache versions and I don't see them with 2.4.46.

I guess the 500 happens to be logged if one side shuts down the connection not with a regular FIN but with a RST.
Comment 4 Pavel Mateja 2021-10-14 14:44:59 UTC
Created attachment 38069 [details]
filtered tcpdump output of one request returning 500

I can see the client sent RST after it got FIN from proxy in tcpdump output.
Comment 5 Pavel Mateja 2021-10-14 14:46:20 UTC
Created attachment 38070 [details]
trace for one 500 request

And there we have proxy trace for this one.
Is it bug in client then?
Comment 6 Ruediger Pluem 2021-10-14 15:25:18 UTC
(In reply to Pavel Mateja from comment #5)
> Created attachment 38070 [details]
> trace for one 500 request
> 
> And there we have proxy trace for this one.
> Is it bug in client then?

I tend to say yes as it should sent a FIN back to close the connection.
Comment 7 Yann Ylavic 2021-10-14 16:09:42 UTC
(In reply to Ruediger Pluem from comment #6)
> 
> I tend to say yes as it should sent a FIN back to close the connection.

Right because without this FIN/ACK there is no way for the proxy to determine whether the last packet sent was received by the peer (besides looking at the sequence number of the RST, which we don't have access to at mod_proxy level..).
So it may be an error or not.

Maybe a SetEnv "proxy-tunnel-no-error-for-rst-on-fin" (or alike) could forcibly control whether we tolerate a RST on a FIN-ed connection, though it's for the final status in access log "only" (it won't change anything from the tunneling POV).
Comment 8 Pavel Mateja 2021-10-15 11:55:12 UTC
This looks like a bug in the proxy client and not in Apache.
Comment 9 Yann Ylavic 2021-10-15 12:04:04 UTC
(In reply to Pavel Mateja from comment #2)
> Which configuration parameter sets this timeout nowadays? ProxyTimeout did
> not do the trick.

There is r1894290 to honor the worker's timeout= parameter, ProxyTimeout or Timeout (in that or order) in mod_proxy_connect.
Comment 10 Graham Leggett 2021-12-10 13:25:54 UTC
Backported to v2.4.52.