Bug 66535 - FarmWarDeployer will fail to deploy a WAR file when maxvalidtime is less than the time it takes to transfer
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Cluster
Version: 9.0.x
Hardware: PC Linux
Importance: P2 normal
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
 
Reported: 2023-03-20 17:03 UTC by Alex West
Modified: 2023-03-23 11:34 UTC



Description Alex West 2023-03-20 17:03:44 UTC
We have been tracking down an issue where a cluster using FarmWarDeployer would deploy to only some nodes. On the others, the WAR file would be written to the tempdir but would fail to deploy.

The destination tempdir war file would eventually become 0 bytes on the failed nodes, with no log entries to explain why. 

After extensive testing, we determined that maxValidTime is not an idle timeout: it does not measure how long a file has been left open without being written to. It is a limit on the total time a FileMessageFactory file can be open at all.

As soon as maxValidTime has elapsed, the file is unlinked and appears as 0 bytes in the tempdir location, even though the deployer is still actively writing to it.
Comment 1 Alex West 2023-03-20 17:08:50 UTC
Although this may be 'intended' behavior, the code leads me to believe that the developers meant this to be a check on whether the file was still 'valid', in the sense of not having been 'idle' for some time.

I assume this because of the terminology in the code: the check is a method named isValid, and it even computes a variable named timeIdle. However, as far as I can see, nothing here actually measures idleness, because timeIdle is computed from creationTime. The check is called by the main background process thread in its loop.

    public boolean isValid() {
        if (maxValidTime > 0) {
            long timeNow = System.currentTimeMillis();
            int timeIdle = (int) ((timeNow - creationTime) / 1000L);
            if (timeIdle > maxValidTime) {
                cleanup();
                if (file.exists() && !file.delete()) {
                    log.warn(sm.getString("fileMessageFactory.deleteFail", file));
                }
                return false;
            }
        }
        return true;
    }

which is called by

    protected void removeInvalidFileFactories() {
        String[] fileNames = fileFactories.keySet().toArray(new String[0]);
        for (String fileName : fileNames) {
            FileMessageFactory factory = fileFactories.get(fileName);
            if (!factory.isValid()) {
                fileFactories.remove(fileName);
            }
        }
    }

which is called by

    public void backgroundProcess() {
        if (started) {
            if (watchEnabled) {
                count = (count + 1) % processDeployFrequency;
                if (count == 0) {
                    watcher.check();
                }
            }
            removeInvalidFileFactories();
        }

    }
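
To illustrate the effect of the call chain above, here is a minimal, self-contained sketch. It is not Tomcat code: FactorySketch, simulate and the simulated clock are hypothetical names, and only the comparison from the quoted isValid() is modelled. The numbers (a 400 second transfer, a limit of 300 seconds, a check every 10 seconds) are arbitrary.

```java
import java.util.concurrent.atomic.AtomicLong;

class CreationTimeExpiryDemo {

    /** Hypothetical stand-in for FileMessageFactory: only the timing check is modelled. */
    static final class FactorySketch {
        private final AtomicLong clockMillis; // simulated clock, not System.currentTimeMillis()
        private final long creationTime;
        private final int maxValidTime;       // seconds

        FactorySketch(AtomicLong clockMillis, int maxValidTime) {
            this.clockMillis = clockMillis;
            this.creationTime = clockMillis.get();
            this.maxValidTime = maxValidTime;
        }

        /** Same comparison as the quoted isValid(): age since creation, not idle time. */
        boolean isValid() {
            if (maxValidTime > 0) {
                int age = (int) ((clockMillis.get() - creationTime) / 1000L);
                return age <= maxValidTime;
            }
            return true;
        }
    }

    /**
     * Simulates a transfer that writes one chunk per second for transferSeconds,
     * with a background check every checkEverySeconds.
     * Returns the second at which the factory was invalidated,
     * or -1 if the transfer finished first.
     */
    static int simulate(int transferSeconds, int maxValidTime, int checkEverySeconds) {
        AtomicLong clock = new AtomicLong(0L);
        FactorySketch factory = new FactorySketch(clock, maxValidTime);
        for (int second = 1; second <= transferSeconds; second++) {
            clock.set(second * 1000L);
            // A chunk would be written here every second; the check ignores that.
            if (second % checkEverySeconds == 0 && !factory.isValid()) {
                return second;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(simulate(400, 300, 10)); // invalidated mid-transfer
        System.out.println(simulate(200, 300, 10)); // finishes before the limit
    }
}
```

In this sketch the slow transfer is cut off at the first background check after the limit, although a chunk was written in every second up to that point.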
Comment 2 Alex West 2023-03-20 17:11:38 UTC
If this is intended behavior, the documentation should be updated to explain that this setting limits how long a WAR file transfer may take, which matters over a slow network, with many nodes, etc. It should state that this is NOT an idle timeout but a limit on how long the temporary WAR file can exist, even while it is being written to.
Comment 3 Mark Thomas 2023-03-22 17:57:43 UTC
This is the commit that introduced maxValidTime:

https://github.com/apache/tomcat/commit/4364cbc8d1f5cc6dbe9be0132d92e593ef67346c

Having looked at the commit, I think the intention could be taken to be either possibility. On balance, it does seem odd to remove the FileMessageFactory while messages are still being written. Therefore, I intend to look at the possibility of making this truly an idle time with the code and documentation updated/clarified accordingly.
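
For illustration only, one way a true idle-time check could look is sketched below. This is not the committed fix: IdleFactorySketch, touch(), lastActivity and the simulated clock are hypothetical names, and the sketch assumes the timestamp is refreshed on every write.

```java
import java.util.concurrent.atomic.AtomicLong;

class IdleTimeExpiryDemo {

    /** Hypothetical idle-based variant: validity depends on time since the last write. */
    static final class IdleFactorySketch {
        private final AtomicLong clockMillis; // simulated clock
        private final int maxValidTime;       // seconds
        private long lastActivity;

        IdleFactorySketch(AtomicLong clockMillis, int maxValidTime) {
            this.clockMillis = clockMillis;
            this.maxValidTime = maxValidTime;
            this.lastActivity = clockMillis.get();
        }

        /** Assumed to be called whenever a message is written to the file. */
        void touch() {
            lastActivity = clockMillis.get();
        }

        boolean isValid() {
            if (maxValidTime > 0) {
                int timeIdle = (int) ((clockMillis.get() - lastActivity) / 1000L);
                return timeIdle <= maxValidTime;
            }
            return true;
        }
    }

    /**
     * Writes one chunk per second until writeUntilSecond, then stalls.
     * Returns the second at which the factory was invalidated,
     * or -1 if it stayed valid for all of observeSeconds.
     */
    static int simulate(int observeSeconds, int writeUntilSecond,
                        int maxValidTime, int checkEverySeconds) {
        AtomicLong clock = new AtomicLong(0L);
        IdleFactorySketch factory = new IdleFactorySketch(clock, maxValidTime);
        for (int second = 1; second <= observeSeconds; second++) {
            clock.set(second * 1000L);
            if (second <= writeUntilSecond) {
                factory.touch(); // active transfer keeps the factory alive
            }
            if (second % checkEverySeconds == 0 && !factory.isValid()) {
                return second;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(simulate(400, 400, 300, 10));  // slow but active transfer
        System.out.println(simulate(1000, 100, 300, 10)); // transfer stalls at 100 s
    }
}
```

With this variant a slow but active transfer is never cut off, while a transfer that stalls is still cleaned up once it has been idle for longer than the limit.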
Comment 4 Mark Thomas 2023-03-23 11:34:22 UTC
Fixed in:
- 11.0.x for 11.0.0-M5 onwards
- 10.1.x for 10.1.8 onwards
-  9.0.x for  9.0.74 onwards
-  8.5.x for  8.5.88 onwards