Bug 59765

Summary: provide a way to obfuscate/hash IP addresses
Product: Apache httpd-2 Reporter: Eric Covener <covener>
Component: mod_status Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: NEW ---    
Severity: enhancement CC: jim, szg0000, toscano.luca
Priority: P2    
Version: 2.5-HEAD   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Mod status PrivacyMode directive (directory context aware version)

Description Eric Covener 2016-06-27 17:42:44 UTC
provide a way to obfuscate/hash email addresses...

So /server-status can be more reasonably used publicly.

Also, provide an option that says "yes, this is public on purpose".
Comment 1 Eric Covener 2016-06-27 18:28:38 UTC
er, IP addresses :(
Comment 2 Luca Toscano 2016-07-09 15:35:50 UTC
The implementation does not seem very difficult, but I have a couple of questions:

1) What would be the best way to turn the "obfuscate" mode on? Something like ExtendedStatus in core, or some other means?

2) Hashing every single IP address with a reasonable function could be good for a lot of reasons (no need for shared state to assign the same value to the same IP over multiple requests, a choice of hashing functions, etc.), but it could also waste resources on hash calculations on busy servers. Would it be reasonable to just remove the "Client" column from status and extended status? This concern might not be relevant with modern CPUs, but in my opinion it is worth discussing.
Comment 3 Eric Covener 2016-07-09 15:57:54 UTC
(In reply to Luca Toscano from comment #2)
> The implementation seems not super difficult, but I have a couple of
> questions:
> 
> 1) How would it be better to set the "obfuscate" mode to on? Something like
> ExtendedStatus in core or by other means?

mod_status directive would be best.  
> 
> 2) hashing every single IP address with a reasonable function could be good
> for a lot of reasons (no need for a shared state to assign the same value to
> the same IP over multiple requests, variety in the hashing functions, etc..)
> but it could also lead to resources waste while doing hash calculations on
> busy servers. Would it be reasonable just to remove the "Client" column from
> status and extended status? This concern might be not relevant with modern
> CPU, but it is good in my opinion to discuss it.

I thought this too, and maybe it's fine for a first pass, but you probably would want to know how many clients were the same at any given time.   Maybe blank out the middle?
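
As an illustration only, one possible reading of "blank out the middle" can be sketched in Python (mod_status itself is written in C). The function name `mask_middle` and the choice to keep the first and last group are assumptions, not something decided in this thread.

```python
import ipaddress


def mask_middle(addr: str) -> str:
    """Keep the first and last group of an address and blank out the rest.

    Hypothetical helper: the thread does not specify which parts to keep.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        parts = str(ip).split(".")
        return ".".join([parts[0], "x", "x", parts[3]])
    # Expand IPv6 so that "::" shorthand always yields eight groups.
    groups = ip.exploded.split(":")
    return ":".join([groups[0]] + ["x"] * 6 + [groups[7]])


print(mask_middle("192.0.2.17"))   # 192.x.x.17
print(mask_middle("2001:db8::1"))  # 2001:x:x:x:x:x:x:0001
```

Two clients that share the same first and last group would look identical under this scheme, so it only approximates "how many clients were the same".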
Comment 4 Luca Toscano 2016-07-15 10:56:29 UTC
(In reply to Eric Covener from comment #3)
> 
> I thought this too, and maybe it's fine for a first pass, but you probably
> would want to know how many clients were the same at any given time.   Maybe
> blank out the middle?

I would still prefer complete anonymity; I am not sure we can completely avoid fingerprinting otherwise. If the goal is to give the admin a way to know which clients are connected, we could offer multiple views of mod_status: a privacy one available to everybody (with the IPs removed completely) and a more complete one restricted, for example, to localhost. Maybe this is possible simply with the new directive?

Something like:

<Location "/server-status-admin">
    SetHandler server-status
    Require ip 127.0.0.1
</Location>

<Location "/server-status">
    SetHandler server-status
    Require all granted
    ServerStatusPrivacyMode on
</Location>
Comment 5 Luca Toscano 2016-07-20 11:16:52 UTC
Example of what I meant: http://apaste.info/SyZ

This is only a proof of concept and works only in Directory/Location context. The idea is to completely remove the Client IP column and add the sentence "Client IP removed due to privacy mode set." on top of the table.
Comment 6 Eric Covener 2016-07-20 13:56:18 UTC
(In reply to Luca Toscano from comment #5)
> Example of what I meant: http://apaste.info/SyZ
> 
> This is only a proof of concept and works only with Directory/Location
> context. The idea is to remove completely the Client IP column and add the
> sentence "Client IP removed due to privacy mode set." on top of the table.

looks like progress, but I would suggest factoring out just the bit that retrieves the client ip, even if it means a dummy column.
Comment 7 Luca Toscano 2016-07-22 14:07:38 UTC
(In reply to Eric Covener from comment #6)

> looks like progress, but I would suggest factoring out just the bit that
> retrieves the client ip, even if it means a dummy column.

New diff: http://apaste.info/0le

If this is ok, the last step would be to make it possible to set this in the server config as well as per directory. I have never done that, but it doesn't seem to be difficult.
Comment 8 William A. Rowe Jr. 2016-07-23 14:37:41 UTC
https://lists.apache.org/thread.html/c4d7a66ca113727a1eb3f2fc3e17e367e08cd38a7fc36d5a252422df@1443710720@%3Csite-dev.apache.org%3E

I'd done this without patching mod_status...

<Location /server-status>
  SetHandler server-status
  <If "%{CONN_REMOTE_ADDR} != '127.0.0.1'">
    SetOutputFilter Sed
    OutputSed "s#<td>[^<]*</td><td nowrap>#<td>redacted</td><td nowrap>#g"
  </If>
</Location>

This provides no client IP, unless a trusted service (e.g. the host itself) is inspecting the output.
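
To show what the substitution above does to a table row, here is a small Python sketch that applies the same regular expression. The sample row is an assumption made up for illustration; the real mod_status markup may differ.

```python
import re

# Same pattern as the OutputSed expression: a plain <td> cell that is
# immediately followed by a <td nowrap> cell.
PATTERN = re.compile(r"<td>[^<]*</td><td nowrap>")


def redact(html: str) -> str:
    """Replace the cell preceding each <td nowrap> with 'redacted'."""
    return PATTERN.sub("<td>redacted</td><td nowrap>", html)


# Hypothetical row, not copied from real server-status output.
row = "<tr><td>42</td><td>203.0.113.9</td><td nowrap>www.example.org</td></tr>"
print(redact(row))
# <tr><td>42</td><td>redacted</td><td nowrap>www.example.org</td></tr>
```

Only the cell directly before `<td nowrap>` is rewritten, which is why the approach depends on the exact HTML layout.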

The issue with hashing the IP is that it is reasonably reversible, being only one DWORD of data (excepting IPv6). The salt can be ascertained by examining the salt applied to the requester's own entry in the status output.
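
The reversibility argument can be sketched in Python as follows. SHA-256, the salt value, and the function names are assumptions for illustration (the thread does not name a hash function), and the search is limited to a /24 so it finishes instantly; the full IPv4 space is only 2^32 candidates.

```python
import hashlib
import ipaddress
from typing import Optional


def hash_ip(addr: str, salt: bytes) -> str:
    """Salted hash of an address (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(salt + addr.encode("ascii")).hexdigest()


def recover_ip(digest: str, salt: bytes, network: str) -> Optional[str]:
    """Brute-force every address in `network` until the digest matches."""
    for ip in ipaddress.ip_network(network):
        if hash_ip(str(ip), salt) == digest:
            return str(ip)
    return None


# The salt is assumed to be known already, e.g. worked out from the
# requester's own entry as described above.
salt = b"example-salt"
digest = hash_ip("198.51.100.77", salt)
print(recover_ip(digest, salt, "198.51.100.0/24"))  # 198.51.100.77
```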
Comment 9 Luca Toscano 2016-07-24 08:47:42 UTC
Really nice, I didn't know about this config snippet! We could add it (or its mod_lua version) to the mod_status docs as a quick and effective solution to this problem. I personally don't love the idea of relying on HTML matching to do the replacement: if something changes in mod_status' output for some reason, the sed replacement might stop working. I know this is really unlikely, but I am always pessimistic when thinking about worst-case scenarios :)

Eric, William: you have tons more experience than me in httpd development, please let me know the best way forward. If we want to go for William's solution I'll update the docs accordingly!
Comment 10 Luca Toscano 2016-10-20 16:56:43 UTC
Re-added a very simple patch in:

http://home.apache.org/~elukey/httpd-trunk-mod_status-privacy_mode.patch

This one only adds a new server-level directive; IIRC this was the first one suggested on IRC. I'd also see value in having a new directive that works with Location blocks as well, in order to allow requests from localhost to display client IPs.
Comment 11 Jim Jagielski 2016-10-20 17:27:16 UTC
I'd prefer that, instead of printing out "", it printed out something like "x.x.x.x" or "255.255.255.255" for those systems which may try to screen scrape (or use the XML output option).
Comment 12 Luca Toscano 2016-11-26 10:05:54 UTC
Created attachment 34479
Mod status PrivacyMode directive (directory context aware version)
Comment 13 Luca Toscano 2016-11-26 10:11:57 UTC
The attached patch creates a new directive called "PrivacyStatus" that is directory context aware (the previous patch was server level only). I think this would be good for admins who need to publish an external, public-facing server-status while still being able to consult a private one showing IPs.

I have not yet worked on Jim's comment about replacing the IPs with x.x.x.x rather than a blank.

1) Would we need to distinguish between IPv4/6? So something like x.x.x.x vs [x:x:x:x] or similar? I guess that probably this info falls under the "privacy" shield that we want to offer, but at the same time it might confuse users. 

2) Is there an XML output option for mod_status?
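
For question 1, the two options can be sketched in Python as follows. The function name `privacy_placeholder` and the `reveal_family` flag are hypothetical; the placeholder strings are the ones suggested in this thread.

```python
import ipaddress


def privacy_placeholder(addr: str, reveal_family: bool = True) -> str:
    """Return a fixed placeholder instead of the client address.

    With reveal_family=False every client gets the same string, so not
    even the address family (IPv4 vs IPv6) is disclosed.
    """
    if not reveal_family:
        return "x.x.x.x"
    ip = ipaddress.ip_address(addr)
    return "x.x.x.x" if ip.version == 4 else "[x:x:x:x]"


print(privacy_placeholder("192.0.2.17"))                        # x.x.x.x
print(privacy_placeholder("2001:db8::1"))                       # [x:x:x:x]
print(privacy_placeholder("2001:db8::1", reveal_family=False))  # x.x.x.x
```

A single placeholder is easier for screen scrapers to handle, while distinguishing the families leaks slightly more about each client.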
Comment 14 Luca Toscano 2017-01-30 12:40:00 UTC
After a chat with Humbedooh, there might be a better way to do this using mod_lua. I am going to update this task and the documentation when the Lua code is published.