Summary: | DirectoryScanner does not efficiently handle filesystem loops | ||
---|---|---|---|
Product: | Ant | Reporter: | Michael Barker <mikeb01> |
Component: | Core | Assignee: | Ant Notifications List <notifications> |
Status: | NEW --- | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 1.9.13 | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: |
Patch to handle file system loop in directory scanner.
Patch to handle file system loop in directory scanner v2 Patch to handle file system loop in directory scanner v3 Patch to handle file system loop in directory scanner v4 |
Description
Michael Barker
2018-10-23 19:35:05 UTC
Created attachment 36206 [details]
Patch to handle file system loop in directory scanner.
This patch will skip over directories that create a file system loop. It does this by checking if the canonical path of a child directory is a prefix of the canonical path of the parent.
There is a test included.
The test also highlighted that the delete task will follow symlinks, which can also fall afoul of file system loops, so I've changed the delete task to only follow directories that are not symlinks and just delete the symlink itself instead.
I've attached a proposed patch for this issue. Created attachment 36210 [details]
Patch to handle file system loop in directory scanner v2
Updated patch that contains more useful log message.
Hello Michael, I haven't thoroughly reviewed the patch or checked the implication of the change. However, this change: --- src/main/org/apache/tools/ant/DirectoryScanner.java (date 1531197234000) +++ src/main/org/apache/tools/ant/DirectoryScanner.java (date 1540366425000) @@ -1258,6 +1258,18 @@ final String name = vpath + newFile; final TokenizedPath newPath = new TokenizedPath(path, newFile); final File file = new File(dir, newFile); + + try { + String fileCanonicalPath = file.getCanonicalPath(); + String dirCanonicalPath = dir.getCanonicalPath(); + if (dirCanonicalPath.startsWith(fileCanonicalPath)) { + continue; + } + } catch (IOException e) { + System.err.println("Failed to get canoncial paths to determine " + + "if filesyste loop exists, continuning"); + } + Could you instead try something like this (just pseudo-code, haven't checked for compile errors): try { if (FileUtils.getFileUtils().isLeadingPath(file.getAbsoluteFile(), dir.getAbsoluteFile(), true)) { continue; } } catch (IOException e) { // log it } The FileUtils#isLeadingPath with resolveSymlink = true would do the same as what you have attempted in the patch. Created attachment 36211 [details]
Patch to handle file system loop in directory scanner v3
Use FileSystemUtils.
Hello Michael, The v3 patch mostly looks fine to me. However, I'm not too sure about the changes you have in the Delete task where you no longer delete a directory if it is a symbolic link. I'll need inputs from Stefan and/or others on this one before merging it. The reason to change the delete task is that that unit test would get stuck in a loop when I added the cases where there was a file system loop. There are a couple of options that I could take with the delete task: 1) Use the same loop detection as in the Directory scanner an not follow links that create a file system loop. 2) Change the delete task to take a parameter of something like followSymLinks (defaulted to true) so that existing behaviour remains the default and the new behaviour is optional. My preference is for #1 as it requires less input from a user to prevent it from breaking. I'll have a look at code some of this up this weekend. Created attachment 36231 [details]
Patch to handle file system loop in directory scanner v4
This changes the delete task to similarly handle file system loops. It won't follow a symlink on delete if that symlink results in a file system loop. It will follow it in other cases.
Michael, I have now merged this patch (with minor changes that include change to the error messages and a comment to the changed code) to our 1.9.x and master branches. It seems to have introduced a failure in an existing test[1]. I'll look into it to see what needs to be done. I'll let this issue stay open until that test failure is sorted out. [1] https://builds.apache.org/job/Ant-Build-Matrix-1.9.x-Linux/OS=xenial,jdk=JDK%201.5%20(latest)/286/ The names from those tests suggest that the functionality I've added has changed the behaviour such that those tests are violated. We may need to remove those tests if we want to us my patch. Hello Michael, I have been away a bit, so couldn't respond earlier. Coming back to this, you mention OOM issue that initiated this patch. Would it be possible to attach a build file that reproduces this and also the exception stacktrace that you see (without the patches)? That way, we can see if this can be solved in a different manner. The current scenario we are running into: - We have a build with roughly the following structure. This is an artefact of the build tool we happen to be using (BUCK). <project root> <source directory> ... <source directory> <build output> <symlink -> project root> A <output directory> <symlink -> A> ... <output directory> <symlink -> A> The build has a total of around 1M files, of which ~800K are in the output directory. Note that we don't use Ant to build any of this. Where this hits us is that we use Jenkins for CI and the Ant DirectoryScanner is used find artefacts from the build to archive. When Jenkins runs the scanner is continues to recurse the project through the project root symlink. Because the DirectoryScanner holds onto all of the files it visits (not just the ones that match) it rapidly builds a very large list of files. Some heap debugging shows ~50M entries in the directory scanner before the Jenkins archive task starts to fail with OOM errors. I'll have a look later this week about trying to build a test case for this. I've fixed the problems caused by this issue elsewhere (Jenkins). Feel free to close. |