Bug 1550

Summary: delete task follow symbolic links
Product: Ant
Reporter: michele.quaini
Component: Core tasks
Assignee: Ant Notifications List <notifications>
Status: RESOLVED FIXED
Severity: enhancement
CC: colby.gooding, notifications
Priority: P1    
Version: 1.3   
Target Milestone: ---   
Hardware: Sun   
OS: other   
Attachments: [PATCH] Deletion of symlinks does not result in deletion of file or dir it links to. Just the symbolic link is deleted - Thanks, Magesh

Description michele.quaini 2001-04-27 00:50:36 UTC
I made a symbolic link under the directory webapps/<project>, 
containing some files to be displayed

> cd $TOMCAT_HOME/webapps/<project>
> ln -s /usr/local/.../invoices invoices

Executing the following target clean:

> build clean


it followed the symbolic link, deleting the entire directory content.

content of build.xml:


...
  <target name="clean">
    <delete dir="${deploy.home}"/>
  </target>
...
Comment 1 michele.quaini 2001-04-27 00:52:23 UTC
*** Bug 1551 has been marked as a duplicate of this bug. ***
Comment 2 Stefan Bodewig 2001-07-31 03:32:51 UTC
Right now I cannot see a clean way for a Java program to identify a File instance
to be a symbolic link - so there is not much Ant can do here, that's why I've
degraded it to an Enhancement.
Comparing absolutePath and canonicalPath will lead to false positives on case
insensitive file systems or systems with uncommon file name conventions.
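
The comparison discussed here can be sketched as follows. This is a minimal illustration and not code from any Ant patch; the class and method names are made up for the example. It reports any difference between the two path strings, which is why it can misfire: a symbolic link produces a difference, but so does a case mismatch on a case-insensitive file system or a ".." segment in the path.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper illustrating the absolute-vs-canonical comparison.
final class NaiveLinkCheck {

    // Returns true when the absolute and canonical paths differ.
    // A symbolic link anywhere in the path causes a difference, but so can
    // a case difference on a case-insensitive file system or a ".." segment.
    static boolean pathsDiffer(File f) throws IOException {
        return !f.getAbsolutePath().equals(f.getCanonicalPath());
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + " -> " + pathsDiffer(new File(arg)));
        }
    }
}
```

A result of true therefore means "the path is not in canonical form", which is a weaker statement than "the file is a symbolic link".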
Comment 3 Magesh Umasankar 2001-09-17 07:22:59 UTC
>Comparing absolutePath and canonicalPath will lead to false positives on case
>insensitive file systems or systems with uncommon file name conventions.

Please enumerate the systems you are concerned about.  Perhaps we could write 
code such that it ignores case on such systems.

Thanks,
Magesh
Comment 4 Magesh Umasankar 2001-09-18 10:05:46 UTC
Windows and Mac are the filesystems that I know of where the case of file names 
does not matter.  I will send a patch soon that identifies symbolic links by 
comparing canonicalpath and absolutepath and deletes just the symlink instead of 
the file/dir it links to.  Using Java, deleting a symlink is a bit round-about, 
but very much possible.

If there are any OSs other than Mac and Windows that are case-insensitive, 
please modify the isCaseSensitive method accordingly and that will be all there 
is to it.

Thanks,
Magesh
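
The idea in this comment might look roughly like the sketch below. The attached patch is not reproduced in this thread, so everything here is an assumption: the class name, the signatures, and in particular the heuristic of guessing case sensitivity from the os.name system property.

```java
import java.util.Locale;

// Hypothetical sketch; not the code from the attached patch.
final class CaseAwareCompare {

    // Assumption: guess case sensitivity from the os.name value alone,
    // treating Windows and Mac as case-insensitive as the comment suggests.
    static boolean isCaseSensitive(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        return !(os.contains("windows") || os.contains("mac"));
    }

    // Returns true when the two paths differ under the given case rule.
    static boolean pathsDiffer(String absolute, String canonical, boolean caseSensitive) {
        return caseSensitive ? !absolute.equals(canonical)
                             : !absolute.equalsIgnoreCase(canonical);
    }

    public static void main(String[] args) {
        boolean sensitive = isCaseSensitive(System.getProperty("os.name"));
        System.out.println("case sensitive guess: " + sensitive);
        System.out.println(pathsDiffer("C:\\Temp\\a.txt", "C:\\temp\\A.TXT", sensitive));
    }
}
```

The os.name value describes the operating system only. It says nothing about the file system that a particular path actually lives on, so a guess of this kind can be wrong for mounted or networked volumes.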
Comment 5 Magesh Umasankar 2001-09-18 10:08:55 UTC
Created attachment 575 [details]
[PATCH] Deletion of symlinks does not result in deletion of file or dir it links to.  Just the symbolic link is deleted - Thanks, Magesh
Comment 6 Stefan Bodewig 2001-09-27 02:55:22 UTC
Additional operating systems where the default filesystem is case insensitive
include OS/2 and OpenVMS (not sure about OS/390).  My main concern is that it is
less dependent on the operating system but depends on the filesystem itself.  

HFS+ devices under MacOS X, VFAT mounts under Linux (or any other Unix like
system mounting local "Windows drives" on the i386 platform), SMB or NFS mounts
of filesystems residing on a server that runs a case insensitive filesystem and
so on will not be case sensitive although the operating system would indicate
the filesystems are sensitive.

The opposite effect would be a Windows client accessing an NFS or Samba exported
filesystem from a Unix box.

I have no idea what the canonical path will be - whether Java will just look at
the operating system, but I see too many "we don't really know" stuff here -
we shouldn't rush and put it into Ant 1.4.1 IMHO, let's patch the 1.5 branch
and collect some experience with it.
Comment 7 Magesh Umasankar 2001-09-27 09:06:35 UTC
With respect to SMB:
Let us say we have a file c1 and a symlink to that, say, c2.  Both these files 
are in a Unix/Linux system.

Using SMB, these files become available to a Windows machine.  In this case, the 
paths returned by getCanonicalPath and getAbsolutePath in the Windows JVM are 
one and the same.  In fact, if we issue File.delete on c2, just c2 will be 
deleted.  The JVM will not be intelligent enough to resolve the symlink in this 
case.  In other words, when Samba is used, and a delete is attempted from 
Windows, the delete task will *not* follow symlinks - it won't even know it is 
a symlink - it is just another file as far as Windows is concerned.

+1 to patch it against 1.5 branch and get user feedback.  I don't have access 
to various systems you have mentioned (OS-X, OS/2, OpenVMS) and I have to rely 
on the feedback to include additional criteria.  

Thanks,
Magesh
Comment 8 Magesh Umasankar 2001-10-15 08:27:16 UTC
Some wider problems when using Samba:
Assume I am connecting to a Unix file system from Win using samba.
Assume there are two hard files myfile (size 3) and MYFILE (size 5)
All of Ant's current tasks will pick the wrong file when executed from windows.
For example:

<delete file="h:/MYFILE"/>

will actually end up deleting myfile  

So, there isn't much we can do with Samba case-sensitive files as it is really a
Java file handling problem.

Just wanted to share this so that we are aware.

Thanks,
Magesh
Comment 9 Magesh Umasankar 2001-10-15 11:12:43 UTC
On Mac OS-X, comparing canonicalpath and absolutepath leads to false positives
when the directory is aliased.  However, case-sensitivity doesn't seem to be an
issue here - it behaves like linux/unix.
Comment 10 Steve Loughran 2001-10-15 13:19:56 UTC
On the subject of case sensitivity, clearcase on windows is a sporadically case 
sensitive filesys, to the extent that "includes=*.ttf" seems not to match 
against "UPPERCASE.TTF". Hence code containing fragments like
	<fileset dir="${dir.fonts}" includes="*.ttf,*.TTF" />
I don't know how it behaves vis-a-vis symlinks.

Comment 11 Magesh Umasankar 2001-10-18 17:21:59 UTC
*** Bug 4281 has been marked as a duplicate of this bug. ***
Comment 12 Mark Nelson 2001-12-24 12:44:30 UTC
This problem is not just with the delete task.  Suppose you want to copy a 
filesystem that contains symbolic links.  The JVM will happily follow the 
symbolic links rather than copy the links themselves.

For a specific example, we host a large website.  The top-level pages have 
symbolic links that Apache follows to the content.  We can't use Ant to copy 
the pages from our source to destination because Ant will follow the links and 
copy the content.

We get around this problem by using the <exec> task and calling tar/untar to 
perform the copy.

Fundamentally, the problem is that Java has little or no knowledge of symbolic 
links.  Unfortunately, this won't get into the 1.4 JDK.
Comment 13 Jim White 2001-12-24 13:06:27 UTC
I don't think it makes sense for Ant to introduce some special case behavior 
for symlinks for a few special commands.  That will just lead to confusion 
about how Ant treats symlinks since it will be trying to not simply do what 
Java does with them (which is to see them as the actual file or directory).

As has been mentioned, this is something that will eventually be addressed by 
the JVM:
http://developer.java.sun.com/developer/bugParade/bugs/4313887.html

That said, if Ant does try to do anything special with regard to symlinks, 
which does seem useful in the delete task, it should be special behavior only 
activated with an attribute that is normally false.  Among other things, that 
will ensure backward compatibility.
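
For readers on later Java versions: the java.nio.file package, added in Java 7 well after this discussion, does expose symbolic links. The sketch below is not part of Ant and the class name is made up; it only illustrates a recursive delete that removes a link without touching what it points to, relying on the fact that Files.walkFileTree does not follow links unless asked to.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Sketch using java.nio.file (Java 7 and later); not part of Ant.
final class DeleteWithoutFollowing {

    // Deletes root and everything below it. Files.walkFileTree does not
    // follow symbolic links unless FOLLOW_LINKS is passed, so a link is
    // visited as a file and only the link itself is removed.
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // the directory could not be fully listed
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            deleteTree(Paths.get(arg));
        }
    }
}
```

Applied to the layout from the original report, a call on the deploy directory would remove the invoices link and leave the linked directory and its files in place.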
Comment 14 Gus Heck 2002-05-02 16:38:50 UTC
So to summarize this bug and propose a general solution:

1. Symlinks on linux can cause annoying, and even potentially dangerous
behavior, particularly with delete and copy type tasks. Understandably users
on these systems might want to write build files that guard against a global
delete of **/*.foo files (such as an accidental symlink to /usr that goes
unnoticed and someone being dumb and forgetting they are su root when doing a
build clean) (*shudder*). (most problems are likely to be much smaller, but
still potentially annoying)

2. Java doesn't understand symlinks very well (at all?), and the mechanisms
for identifying them absolutely in all cases are simply not available.
However, they do fall out nicely on certain file systems comparing absolute
and canonical paths. Our problem is that we don't know what filesystems are
going to be included in a FileSet and there could be more than one.

3. The only person who knows what filesystems are involved is the user who
writes the fileset. (hopefully) Thus the reasonable thing to do is to extend
the functionality that _is_ offered by Java to the user, and let him/her
apply it as needed.

Comment 15 Magesh Umasankar 2002-05-02 16:44:06 UTC
Comparing canonical and absolute paths will fail when aliases are used on unix/
linux systems as well.  It is not very dependable to do this, as I have learnt.
In other words, it will lead to false symlink positives.
Comment 16 Gus Heck 2002-05-02 17:11:46 UTC
I am not sure how one would use an alias to represent a file within a 
filesystem that would be scanned by DirectoryScanner, but for the sake of 
argument, aliases might get lumped with symlinks. The thrust of my solution is 
to approach this by letting the user define their filesets as canonical. I 
would expect that aliases have some of the same problems as symlinks in terms of 
unexpected behavior and gotchas. 

So if the user wants to follow symlinks and aliases, they write their filesets 
as before. If they don't want to follow them (or anything else that isn't 
canonical) then they can set isCanonical = true. If they want to follow aliases 
but not symlinks they are out of luck, but at least they have the choice of 
canonical or not. 

Documentation should discuss what is and isn't canonical of course. 
Comment 17 Gus Heck 2002-05-08 22:21:15 UTC
Is there some non-canonical path that can be returned by File.list() other than 
a symlink?

You may need to forgive my ignorance, but I don't see how aliases are a problem.

Earlier in this bug it was mentioned that when aliases are used in linux they 
may also lead to non-canonical paths (in addition to symlinks). I have done a 
little looking around (Java in a Nutshell, and talking to some friends) and the 
only place I find mention of aliases in linux is shell aliases. So what I am 
wondering is whether or not there is any way in which aliases would affect the 
results of a scan by DirectoryScanner. 

I am not even sure of how Java can get aliases expanded by the shell when 
working with files in the first place.

DirectoryScanner compares the results of a File.list() call to the patterns 
supplied. How is it possible for File.list() to return something that is 
affected by or contains a shell alias? 

If someone could briefly explain how aliases can yield a false positive when 
testing the output of File.list() for symlinks with a comparison of absolute vs 
canonical, it might greatly aid my thinking on this problem. I suspect that 
either I am completely unaware of some facet of the problem, or it isn't a 
problem because of the source of the filenames we are working with.

If it is the latter I would like to proceed with reworking my patch to allow 
filesets to ignore non-canonical paths, which would enable users on linux to 
avoid following symlinks if this was a problem for them, and resolving the core 
of this bug. (The issue of how to safely get rid of the links left behind when 
using delete and a fileset that ignores non-canonical paths still exists, but 
one thing at a time)
Comment 18 Magesh Umasankar 2002-05-08 22:49:35 UTC
-rw-r--r--   1 umagesh  staff   286 May  8 15:44 test.java

Absolute Path: /Users/umagesh/test.java
Canonical Path: /Volumes/data/Users/umagesh/test.java

As Absolute Path and Canonical Path do not match here, it would be treated as
a symbolic link, leading to a false positive.

The above example is on MacOS-X
Comment 19 Magesh Umasankar 2002-05-08 22:52:18 UTC
Apologies:
lrwxrwxr-t   1 root  admin       19 Jan 28 18:30 Users -> /Volumes/data/Users

The base directory was a link after all.
Comment 20 Gus Heck 2002-05-09 01:22:53 UTC
Also the code I wrote would have detected it as a real file anyway because I 
build and test my file object like this:

tempdir = new File(dir.getCanonicalPath());
tester = new File(tempdir, newfiles[i]);
can = tester.getCanonicalPath();
abs = tester.getAbsolutePath();
if (can.equals(abs)) {
    noLinks.add(newfiles[i]);
}

I only flag it as a symlink if the last element in the path is non-canonical.
The rest of directory scanner never sees anything that fails that test.
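
The fragment above can be made self-contained along the lines of the sketch below. The class and method names are illustrative and not taken from the patch. The key step is the same: the parent directory is canonicalized first, so a link higher up in the path (such as the Users link in the Mac OS X example) no longer causes a mismatch, and only a non-canonical last element is rejected.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Self-contained variant of the fragment above; names are illustrative.
final class LastElementCheck {

    // Returns the entries of dir whose own name is canonical. The parent is
    // canonicalized first, so links higher up in the path are resolved
    // before the comparison and only the last element can cause a mismatch.
    static List<String> entriesThatAreNotLinks(File dir) throws IOException {
        File canonicalDir = new File(dir.getCanonicalPath());
        String[] names = dir.list();
        List<String> noLinks = new ArrayList<>();
        if (names == null) {
            return noLinks; // not a directory, or an I/O error occurred
        }
        for (String name : names) {
            File tester = new File(canonicalDir, name);
            if (tester.getCanonicalPath().equals(tester.getAbsolutePath())) {
                noLinks.add(name);
            }
        }
        return noLinks;
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(entriesThatAreNotLinks(dir));
    }
}
```

Like the original fragment, this still relies on path comparison, so it shares the open questions about case-insensitive and network file systems raised earlier in the thread.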
Comment 21 Stefan Bodewig 2002-05-10 06:31:35 UTC
Magesh,

I've tried Gus' code in this situation already and it handles it.  The code will
resolve the file name (test.java) against the canonical path of the parent
directory and compare the absolute path with the canonical path of this newly
resolved file.  Up to now I couldn't get it to create false positives on Linux
and/or FreeBSD.
Comment 22 Magesh Umasankar 2002-05-10 08:28:47 UTC
Awesome then!  Let's get it in.
Comment 23 Stefan Bodewig 2002-05-10 12:54:07 UTC
<fileset> will have a new followsymlink attribute that can be set to false to
avoid the deletion of symbolic links.