Summary: ROBOTS META-Tag directive needed in mod_autoindex
Product: Apache httpd-2
Component: mod_autoindex
Reporter: Richard Schaal <richschaal>
Assignee: Apache HTTPD Bugs Mailing List <bugs>
Adds robot directive to mod_autoindex.c
Configurable HEAD contents
Description Richard Schaal 2006-11-23 05:12:59 UTC
When Htdig is used to index a collection of documents in a directory tree, the pages that contain the directory file listings tend to flood the subsequent search results. A simple robots directive inserted into the pages generated for directories tells the indexing robot not to index the directory page itself, but still to follow its links and read the files. This greatly improves the quality of the search results.

Here is the directive:

    <meta name="robots" content="noindex,follow">

It should be placed in the head section as follows:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
    <html>
    <head>
    <title>Index of /~rds</title>
    <meta name="robots" content="noindex,follow">
    </head>
    <body>
    <h1>Index of /~rds</h1>
    <table><tr><th><img src="/icons/blank.gif" alt="[ICO]"></th><th><a href="?C=N;O=D">Name</a></th><th><a href="?C=M;O=A">Last modified</a></th><th><a href="?C=S;O=A">Size</a></th><th><a href="?C=D;O=A">Description</a></th></tr><tr><th colspan="5"><hr></th></tr>
    <tr><td valign="top"><img src="/icons/back.gif" alt="[DIR]"></td><td><a href="/">Parent Directory</a></td><td> </td><td align="right"> - </td></tr>
    <tr><td valign="top"><img src="/icons/folder.gif" alt="[DIR]"></td><td><a href="Music/">Music/</a></td><td align="right">03-Sep-2006 10:52 </td><td align="right"> - </td></tr>
    ....

I've used this tweak for a year or more without difficulty. It would certainly be nice to see it incorporated so that I don't need to patch future releases of the server.

Thanks! - Richard
Comment 1 Richard Schaal 2006-11-23 05:17:09 UTC
Created attachment 19165: Adds robot directive to mod_autoindex.c
Comment 2 Nick Kew 2006-12-25 08:59:10 UTC
Created attachment 19304: Configurable HEAD contents

This should be configurable. Patch attached. Not sure about including it as standard.
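For later readers, here is a minimal sketch of what a configurable HEAD insert might look like in practice. It assumes your httpd build provides mod_autoindex's IndexHeadInsert directive, which is documented for current 2.4 releases; this report does not state whether that directive corresponds exactly to the patch attached here. The directory path is a hypothetical placeholder.

```apache
# Hypothetical directory; substitute the tree whose generated listings
# should stay out of search results.
<Directory "/var/www/html/files">
    Options +Indexes
    # Assumption: IndexHeadInsert is available in this httpd build.
    # The quoted markup is emitted inside <head> of each generated index page.
    IndexHeadInsert "<meta name=\"robots\" content=\"noindex,follow\">"
</Directory>
```

A quick way to check the result is to fetch a generated listing and confirm that the meta tag appears in the page source.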
Comment 3 D. Stussy 2007-02-01 19:35:49 UTC
I concur, but I came here not because my local HTDig engine indexed the directory, but because google.com's search engine indexed it and returns it in search results.

The "configurable HEAD contents" patch is the correct approach, but I do feel there should be an additional variable (called "IndexRobots") that contains the search engine directives, as this may be an important enough function to have its own name. If one were to go ONLY with the HEAD patch, then one would have to redefine the robots control string any time other information were inserted into the directory's head section; an easy step to forget. "IndexStyleSheet" is already mainstream....
Comment 4 D. Stussy 2007-05-29 19:53:06 UTC
More - I found this trick, but it's not a complete solution.

URL: http://www.htdig.org/FAQ.html#q4.23

"The other technique you can use, if you want the directory index to be made by the web server, is to get the server to insert the robots meta tag into the index page it generates. In Apache, this is done using the HeaderName and IndexOptions directives in the directory's .htaccess file. For example:

    HeaderName .htrobots
    IndexOptions FancyIndexing SuppressHTMLPreamble

and in the .htrobots file:

    <HTML><head>
    <META NAME="robots" CONTENT="noindex, follow">
    <title>Index of /this/dir</title>
    </head>"

--

With this method, the title is NOT dynamic but fixed. If a fixed file with some sort of server-side processing, one that need not reside in the directory being displayed, could be used, then there might be a valid and complete fix, but that seems like a nasty hack to me. The patch seems to be a better solution.
Comment 5 William A. Rowe Jr. 2018-11-07 21:09:39 UTC
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.