Bug 49492

Summary: The fileset resource collection doesn't handle file/directory names with spaces
Product: Ant
Reporter: Stefan Goor <sgoor>
Component: Core
Assignee: Ant Notifications List <notifications>
Status: NEW ---
Severity: normal
CC: john.elion
Priority: P2
Version: 1.8.2
Target Milestone: ---
Hardware: PC
OS: Windows XP
Attachments: Proposed patch - adds "sep" attribute; FileList

Description Stefan Goor 2010-06-23 10:18:56 UTC
The org.apache.tools.ant.types.FileList collection doesn't recognise files correctly when their names contain spaces (as is common on Windows), because white space is hard-coded as a delimiter in the setFiles() method:

[code]
public void setFiles(String filenames) {
    checkAttributesAllowed();
    if (filenames != null && filenames.length() > 0) {
        StringTokenizer tok = new StringTokenizer(
            filenames, ", \t\n\r\f", false);
        while (tok.hasMoreTokens()) {
            this.filenames.addElement(tok.nextToken());
        }
    }
}
[/code]
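
To make the failure concrete, here is a minimal standalone sketch that repeats the tokenising step outside Ant. The class name TokenizerDemo and the sample paths are made up for illustration; only the delimiter string is taken from the method quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; it is not part of Ant.
class TokenizerDemo {
    // Splits the input the way the quoted setFiles() does,
    // but with a caller-supplied delimiter set.
    static List<String> split(String filenames, String delims) {
        List<String> result = new ArrayList<>();
        if (filenames != null && filenames.length() > 0) {
            StringTokenizer tok = new StringTokenizer(filenames, delims, false);
            while (tok.hasMoreTokens()) {
                result.add(tok.nextToken());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample: two paths, the first with a space in a directory name.
        String input = "C:\\My Docs\\a.txt,C:\\tmp\\b.txt";
        System.out.println(split(input, ", \t\n\r\f")); // hard-coded delimiter set
        System.out.println(split(input, ","));          // comma only
    }
}
```

With the hard-coded delimiter set the first path is cut at the space, so the two-entry list comes back as three tokens; with a comma as the only delimiter both paths survive intact.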

I have a comma-separated list of files with spaces in the directory names, and it would not work until I used the short DOS (8.3) name format.

I thought I would try to create a custom task that subclasses FileList and overrides setFiles(), but the filenames Vector member variable is private and is not accessible from subclasses.

Could FileList be updated so that the delimiter can be specified?  I think this could be done simply by using a member variable with a getter and setter for the delimiter; the default value of the variable could just be set to ", \t\n\r\f" to ensure backward compatibility.

I also wonder whether the filenames Vector member variable should have a protected getter method to allow subclassing.
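
A rough sketch of what is being asked for, under stated assumptions: DelimitedFileList, setDelimiters(), getDelimiters() and getFilenames() are hypothetical names, and the class is a standalone stand-in that does not extend any Ant type. It is not a patch against Ant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for FileList; it does not extend any Ant class.
class DelimitedFileList {
    // Default equals the hard-coded delimiter set, so behaviour is unchanged
    // unless a caller sets something else.
    static final String DEFAULT_DELIMITERS = ", \t\n\r\f";

    private String delimiters = DEFAULT_DELIMITERS;
    private final List<String> filenames = new ArrayList<>();

    public void setDelimiters(String delimiters) {
        this.delimiters = delimiters;
    }

    public String getDelimiters() {
        return delimiters;
    }

    // Tokenises with whatever delimiter set is configured at call time,
    // so in this simple version setDelimiters() must be called first.
    public void setFiles(String names) {
        if (names != null && names.length() > 0) {
            StringTokenizer tok = new StringTokenizer(names, delimiters, false);
            while (tok.hasMoreTokens()) {
                filenames.add(tok.nextToken());
            }
        }
    }

    // Protected accessor so that subclasses can reach the parsed names.
    protected List<String> getFilenames() {
        return filenames;
    }
}
```

Calling setDelimiters(",") before setFiles() keeps names with embedded spaces in one piece, while callers that never touch the delimiter get the same splitting as before.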

Thanks,
Stefan
Comment 1 Stefan Bodewig 2010-06-23 10:30:46 UTC
can you turn your list into a list separated by newlines?  If so,
<resourcelist> would solve the problem.
Comment 2 Stefan Goor 2010-06-24 05:46:09 UTC
(In reply to comment #1)
> can you turn your list into a list separated by newlines?  If so,
> <resourcelist> would solve the problem.

I don't think that using newlines to separate the files will work, because spaces will still be used as delimiters, so the file names will be broken up on both spaces and newlines.

The <resourcelist> does allow me to do what I need, but I have to create new files for the values rather than just putting them in a properties file.  It is an acceptable workaround for me at the moment.

Would it be worthwhile making the changes I suggested above to make the filelist type more reusable?  I don't think these changes would impact the type in any negative way.
Comment 3 Stefan Bodewig 2010-06-24 05:55:54 UTC
<resourcelist> reads the list of resources from a different resource.  This
different resource doesn't have to be a file, it can be a property or a string
resource for example.

So if foo is a newline separated list of file names

<resourcelist>
  <propertyresource name="foo"/>
</resourcelist>

should be the collection of those files.

Come to think of it, it may even be possible to use a filterchain to replace
the commas with newlines and use your property as it is.

OTOH, I don't think it would be too hard to provide a patch for an optional
delimiter attribute on filelist.
Comment 4 Stefan Goor 2010-06-24 08:52:08 UTC
(In reply to comment #3)
> <resourcelist> reads the list of resources from a different resource.  This
> different resource doesn't have to be a file, it can be a property or a string
> resource for example.
> 
> So if foo is a newline separated list of file names
> 
> <resourcelist>
>   <propertyresource name="foo"/>
> </resourcelist>
> 
> should be the collection of those files.


That's great, I hadn't realised that was possible.  Thanks for the suggestion!
Comment 5 John Elion 2011-02-16 09:57:11 UTC
I am having a similar issue with <FILELIST> breaking apart filenames with spaces in the "files" string.  In addition to breaking apart the string, I am relying on <FILELIST> to resolve relative pathnames, so I'm not sure that the <RESOURCELIST> workaround will work for me.
Comment 6 John Elion 2011-02-17 08:57:46 UTC
Created attachment 26673
Proposed patch - adds "sep" attribute

Proposed fix attached.  Fix adds "sep" attribute to FileList, such that the default is the original string.

I was able to build Ant from source, but got errors with JUnit (it wasn't clear to me how to modify the source distribution so that the unit tests would run). Ant itself built "out of the box", and the FileList code was crystal clear, so the mod was easy to make.  Behavior without the "sep" attribute was unchanged; with the "sep" attribute, the value was accepted and solved my problem.

I did not see any way to write code to ensure that "sep" is processed regardless of whether it occurs before or after "files", but it seemed to get it right both ways (maybe the API guarantees processing order based on the order of the methods within the class?)
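
One way to make the result independent of attribute order, offered as an assumption rather than as a description of what the attached patch does, is to store the raw "files" string and tokenise it only when the list is read. LazyFileList and its methods are hypothetical names, and the class does not extend any Ant type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class; not the attached patch and not an Ant type.
class LazyFileList {
    private String sep = ", \t\n\r\f"; // original delimiter set as the default
    private String rawFiles = "";

    public void setSep(String sep) {
        this.sep = sep;
    }

    // Simplification: a second call replaces the stored string
    // instead of appending to an existing list.
    public void setFiles(String files) {
        this.rawFiles = (files == null) ? "" : files;
    }

    // Tokenising is deferred until the list is read, by which time
    // both setters have run, whichever was called first.
    public List<String> getFiles() {
        List<String> result = new ArrayList<>();
        StringTokenizer tok = new StringTokenizer(rawFiles, sep, false);
        while (tok.hasMoreTokens()) {
            result.add(tok.nextToken());
        }
        return result;
    }
}
```

Because nothing is split inside the setters, calling setFiles() before setSep() gives the same list as calling them the other way round.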
Comment 7 John Elion 2011-02-17 09:00:52 UTC
Created attachment 26674
FileList

Looks like I inadvertently attached FileSet.java to previous comment.
Comment 8 Matt Benson 2011-02-17 10:47:25 UTC
(In reply to comment #5)
> I am having a similar issue with <FILELIST> breaking apart filenames with
> spaces in the "files" string.  In addition to breaking apart the string, I am
> relying on <FILELIST> to resolve relative pathnames, so I'm not sure that the
> <RESOURCELIST> workaround will work for me.

Firstly, you can use any resource inside resourcelist.  Now, I assume that what you are trying to pass to filelist@files is a property, else you'd just be using nested <file> elements, right?  So you have a <propertyresource>.  I assume you must be using commas for your token, so you could use a <tokens> resourcecollection to break up your propertyresource.  To add leading paths, you could add another level of indirection into the <tokens>:  a <concat> task/resourcecollection using a filterchain configured with a prefixlines filter.  Actually once you're using <concat> you can probably skip the <tokens> collection altogether and use it to tokenize your comma-delimited stuff *and* prefix the resulting lines, passing the <concat> resource directly to <resourcelist>.  I am thus 99% confident that a solution *can* be made to work using <resourcelist>.

I don't have any major antipathy towards making separators configurable in <filelist>; however it would introduce an inconsistency between filelist and fileset/files/dirset that is IMO unnecessary.  These other types are somewhat different than filelist due to the fact that they all deal with directory scanning and therefore support includesfile/excludesfile as a means of specifying patterns with embedded spaces.

Of course, knowing that <filelist> supports the nested <file> element for this purpose (as documented in the manual), an alternative could be to use antcontrib:for to iterate over your comma-delimited property, then use the <augment> task to add the nested <file> elements one at a time.

Please follow up on user@ant.apache.org if you need more information about any of the approaches I've outlined here.

Matt