Bug 40066 - RollingFileAppender rollover does not work when gaps are present in the files
Summary: RollingFileAppender rollover does not work when gaps are present in the files
Status: RESOLVED WORKSFORME
Alias: None
Product: Log4j - Now in Jira
Classification: Unclassified
Component: Appender (show other bugs)
Version: 1.2
Hardware: Other Windows XP
: P2 major
Target Milestone: ---
Assignee: log4j-dev
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2006-07-18 14:02 UTC by Yves Debillo
Modified: 2007-03-04 23:38 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Yves Debillo 2006-07-18 14:02:45 UTC
On a machine under heavy load, and with restricted rights on the log-directory,
it can happen that the rollover does not work for a number files.  Setting to 10
files, I sometimes miss e.g. file.3, file.4, file.5 ...
This seems to be related with the way the rollover method is implemented.

I added a test before renaming a file, to make sure that a temporary lock during
the previous operation does not cause the next operation to fail. 
As a result, the previous test on the last file can be removed. 
With this change, the problem is less frequent so I think it could be a good
improvement.

  public // synchronization not necessary since doAppend is alreasy synched
  void rollOver() {
    File target;
    File file;

    LogLog.debug("rolling over count=" + ((CountingQuietWriter) qw).getCount());
    LogLog.debug("maxBackupIndex="+maxBackupIndex);

    // If maxBackups <= 0, then there is no file renaming to be done.
    if(maxBackupIndex > 0) {
      // lines removed here .......

      // Map {(maxBackupIndex - 1), ..., 2, 1} to {maxBackupIndex, ..., 3, 2}
      for (int i = maxBackupIndex - 1; i >= 1; i--) {
	file = new File(fileName + "." + i);
	if (file.exists()) {
	  target = new File(fileName + '.' + (i + 1));
	  LogLog.debug("Renaming file " + file + " to " + target);
          // lines added here ........
	  if (target.exists())
	    // Delete the destination file, to keep Windows happy.
	    target.delete();
          // end lines added .......
	  file.renameTo(target);
	}
      }

      // Rename fileName to fileName.1
      target = new File(fileName + "." + 1);

      this.closeFile(); // keep windows happy.

      file = new File(fileName);
      LogLog.debug("Renaming file " + file + " to " + target);
      file.renameTo(target);
    }

    try {
      // This will also close the file. This is OK since multiple
      // close operations are safe.
      this.setFile(fileName, false, bufferedIO, bufferSize);
    }
    catch(IOException e) {
      LogLog.error("setFile("+fileName+", false) call failed.", e);
    }
  }
Comment 1 Curt Arnold 2007-03-01 23:10:31 UTC
Possibly duplicate of 41735.  Could you test and confirm?
Comment 2 Yves Debillo 2007-03-02 01:36:13 UTC
(In reply to comment #1)
> Possibly duplicate of 41735.  Could you test and confirm?

It does seem to be closely related enough and could be closed.
I do not agree however with the suggested change for 41735 and do believe that 
my suggested change could still be an improvement.

I will not update 47135 as it is already marked resolved and put my comments 
here:

Problem 1: you are deleting the last index when you start rolling over.  If 
lastindex is e.g. .10, and you have no .9, there is no reason to delete .10.  
The test for deletion of the last file should be placed when you need this 
name for putting the .9 file only.  That is why I removed these lines and 
added the test for the other renames.

Problem 2: for all renames, you are not testing if the target file exists, as 
such a rename will fail when the file has e.g. read-only marks or is open in a 
non-exclusive mode.  First deleting it avoid pending locks.  e.g. virus 
scanner might be scanning the file but without placing a lock

PS: The problem I reported is that the files are not being locked by the 
system because no other process is accessing the files.  It could be that 
writing to the files occurs concurrently from different threads in the 
application all using the same logging sub-system.
Comment 3 Curt Arnold 2007-03-02 11:21:03 UTC
(In reply to comment #2)
>It does seem to be closely related enough and could be closed.
>I do not agree however with the suggested change for 41735 and do believe that 
>my suggested change could still be an improvement.

I'd only close this bug as duplicate of 41735 if the fix for 41735 fixed your problem.  On first read, I 
was interpreting your problem as that you were losing logging data due to heavy load, not that there 
were gaps in the file names.  I think the fix for 41735 may eliminate data loss where a rename or delete 
fails due to some side effect of a heavy load since it will stop subsequent renames and closing of the 
active log file.  However, it could still result in a gap of file names above the locked file which should 
only be a minor cosmetic defect.

If you see a defect in the fix for 47135 (svn diff -r513637:513638) where it does not address the issues 
raised in 47135 or introduces an undesirable side effect, please comment on the bug report for 41735 
and reopen it if you think it is justified.  If that patch doesn't fix the problem is this bug report, then we 
need to keep this bug open and figure out what is going on in your case.  If an eventual patch for this 
bug ends up reverting part of the 47135 patch (while still keeping the behavior improvements), then 
that is fine.


> Problem 1: you are deleting the last index when you start rolling over.  If 
> lastindex is e.g. .10, and you have no .9, there is no reason to delete .10.  
> The test for deletion of the last file should be placed when you need this 
> name for putting the .9 file only.  That is why I removed these lines and 
> added the test for the other renames.

The bug 41735 patch tried to make the minimal changes necessary to eliminate the data loss problem.  
In particular, it tried to make the code path obviously identical to the existing code when the delete and 
renames succeeded.  I'm extremely hesitant to make any substantial change to a successful code path 
without a clearly defined and substantial benefit.  I don't think that filling in the gaps in a partial 
sequence is substantial enough. 

> Problem 2: for all renames, you are not testing if the target file exists, as 
> such a rename will fail when the file has e.g. read-only marks or is open in a 
> non-exclusive mode.  First deleting it avoid pending locks.  e.g. virus 
> scanner might be scanning the file but without placing a lock

The 41735 fix will only attempt to rename a file after having successfully renaming or deleting the 
target file if it had existed.  Another process or thread could recreate the target file, but then the 
rename would fail and you'd defer the rollover and end up with foreign target file in the train and a 
larger than normal file corresponding to the active log file.

> PS: The problem I reported is that the files are not being locked by the 
> system because no other process is accessing the files.  It could be that 
> writing to the files occurs concurrently from different threads in the 
> application all using the same logging sub-system.

There are inheritent suseptibilities in the current RFA's.  The MultiFileAppender effort tried to address 
the same use cases without need to do file renames or preemptive closings.  I'd like to get back to that 
and finish it off, but haven't had the time.  I'm hesitant to making any changes to the existing RFA's 
where the problem isn't clearly identified and the code changes are low-risk.  In this case, it isn't clear 
what the underlying problem is and the code modifications are not obviously low-risk.
Comment 4 Yves Debillo 2007-03-04 23:38:19 UTC
(In reply to comment #3)
I tested the change for bug 47135.  It is a substatial improvement.