rotatelogs: Add -p option to call arbitrary post-rotate program. Post rotation, call the optional post-rotation program in a separate thread. apr_proc_create() is called with cmdtype APR_PROGRAM, so it has to be an executable file. Scripts are supported as long as the shebang line uses a working interpreter. The post-rotation program is run with equivalent arguments and environment variables to let users choose: argv[1]: Full path to current file argv[2]: Full path to previous file Env var ROTATELOGS_PATH_CUR: Full path to current file Env var ROTATELOGS_PATH_PREV: Full path to previous file Notes: The original motivation for this was to have a flexible callback to be able to maintain a current access.log symlink, and also do event-based (vs cron-based) compression and archival/upload to central systems. Thread support is required and assumed. It's not clear to me how important it would be to also support non-threaded systems. Adding a fork-based solution (APR_HAS_THREADS) can be done rather quickly, if needed.
Created attachment 27085 [details] Patch for rotatelogs.c and its documentation
Created attachment 27090 [details] Full updated patch for rotatelogs.c and its documentation Updated patch (full patch attached, applies to svn 1129082), supports forking and threading, depending on apr build support. Tested on Debian 6, Linux 2.6.32-amd64, building APR with and without thread support, then building httpd/support/. Successful results with both.
Adding minfrin, poirier, rjung to CC list, as you have been most recently active in the rotatelogs.c commit history. Is this patch something you could consider merging? Possibly also back-porting to 2.2, so it's easier to get pushed out to distros (I'm thinking of Debian).
Thanks for sending in the patch! I like the idea; I've had a requests for similar "do X post-rotate" functionality so there is some kind of general need here. A few comments: a) What's the purpose of the double-forking and thread support? apr_proc_create() will fork/exec internally, so why fork (/spawn thread) then call apr_proc_create()? b) Isn't the apr_tokenize_to_argv() usage going to split up any paths with spaces in? it should be simple enough to set up argv[] correctly without needing to flatten and reparse the string anyway. c) probably should set the cmdtype to APR_PROGRAM_ENV to pass through any env vars
Hi, Joe. a) This started out as a separate thread (to avoid many processes in the pid list) that calls the external program so that the thread would wait for the program to finish, and catch any problems. I have no issues with changing this, though. Would it make sense to simply fork the external program without looking back, and let it handle its own errors? Its stderr is still directed to the error log, presumably, so it's easy to catch problems anyway. b) As far as I can see in the source, apr_tokenize_to_argv() does the right thing with quoting and escaping. It requires users to put escaped spaces or escaped quotes in the apache config, though. Some examples in the docs would probably do the trick. c) APR_PROGRAM was chosen since the env arg to apr_proc_create() is ignored if the type is APR_PROGRAM_ENV. If we'd rather use apr_env_set() to set the vars, then use APR_PROGRAM_ENV, that should take care of it. Let me know what you think, and I'll make the changes.
Created attachment 27172 [details] simplified patch
Created attachment 27173 [details] second attempt at simplified patch
Sorry that I never followed up before. I've attached a patch which has some simplification in: 1) it is necessary to wait() for the forked children at some point, otherwise we'd collect an ever-increasing horde of zombies (defunct processes). What I've tried here is: a) not using detached processes b) do just "fire and forget" c) collect any zombies before each new post-rotate We could still log the exitcode/reason when collecting the zombies I guess, for diagnostics... not sure. 2) using the tokenize does seem more work than necessary, e.g. if the log filenames had spaces in we'd need to quote to get tokenize to work. have switched to set up argv[] directly 3) seems simpler to drop the env vars completely; if the spawned process wants to use env vars it can do that based on argv What do you think? It seems to work from some brief testing, particularly with a post-rotate program which just does "sleep". I'd be happy to commit as-is; more review/testing would be great.
Thanks, this looks good! I've done a bit of testing, and have some notes: 1. The process management works well. I've tried: a) postrotate program taking longer (sleep) than the specified rotation interval. This of course builds up children, but works exactly as expected. b) Apache/rotatelogs exiting before postrotate program finishes: Child is adopted by init, and continues to run until it completes. 2. Some might be surprised by or dislike the defunct child processes in the process list. I don't mind them at all. A fix could be select/poll, which would also handle exit codes, etc, but I wouldn't say that's a blocker for committing. 3. The usage() output, the manpage and the xml program documentation should not refer to the env variables any more. 5. There's some stray trailing whitespace in your latest diff, lines 108 and 122. Would it be at all possible to also commit this to 2.2? My ultimate goal is to have this included in Debian, hopefully in a Squeeze update (as opposed to Wheezy, which could be years in the future).
Thanks a lot! I made the whitespace and docs changes as suggested, and also fixed the logic so that the program only gets executed after a rotation, not after the initial open() of the file or after a failed attempt to rotate; I presume that was not the intend? (there were cases where the program would omit the second filename arg) (BTW, If you let me know your real name I can put that in CHANGES!) I can propose this for backport to 2.2.x.
It was actually intended to have the program execute on the initial open, with the goal of not necessarily only doing post-rotate work, but also do stuff with the newly created file. The main idea was to make a symlink from the new file to access-current.log or similar (like the -L arg does, although with symlinks). It would be up to the program to verify that the second arg (previous file) is present. Real name: Sven Ulland
OK, that makes sense - the docs didn't mention this. I've updated docs + code to work that way in r1140099. Thanks!
Joe, how did the proposal for backporting to 2.2 go?
Any chance of still getting this backported to 2.2?