Bug 51026

Summary: Ant fails to read tar archives containing a directory entry of size > 0
Product: Ant Reporter: Sandeep Tamhankar <stamhankar>
Component: OtherAssignee: Ant Notifications List <notifications>
Status: NEW ---    
Severity: normal    
Priority: P2    
Version: 1.7.1   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Sandeep Tamhankar 2011-04-05 18:45:59 UTC
Steps to repro (written ad-hoc because the problem is very easy to describe): 

File f = new File("c:/temp");
TarEntry x = new TarEntry(f);
System.out.println(x.getSize());

Expected result:
0

Actual result:
Sometimes 0, sometimes 4096.

The reason is that File.length() behavior is undefined for directories.

Consequence: you create a tar-ball with some directory tar-entries with a size of 4096. Then if you try to untar via TarInputStream, getNextEntry() actually skips 4k into the buffer to get to the next entry, when really the next entry is 0 bytes away (or something like that).

Why this doesn't affect ordinary mortals:
tar'ing something incorrectly like this is fine with GNU tar for extraction; I'm guessing that gnu tar ignores the size field for directory tar entries.

Analogously, TarInputStream / TarEntry work fine when reading a properly generated tar-ball because a proper tar-ball won't have the 4k size value in tar-entries. One could argue that the fix should be in the TarEntry(File) constructor as well as TarInputStream.getNextEntry() so that the producer creates good tars and the consumer is tolerant of slightly bad tars.
Comment 1 Stefan Bodewig 2011-04-12 00:13:37 UTC
The TarEntry constructor has been fixed about two years ago with svn revision 755231 - this is part of Ant 1.8.0.

Since you seem to be using Ant's tar classes as a library let me suggest you take a look at Apache Commons Compress http://commons.apache.org/compress/ (which doesn't read tar archives with directories of a size > 0 properly either).
Comment 2 Sandeep Tamhankar 2011-04-12 17:50:09 UTC
Umm... I don't think the TarEntry constructor has been fixed. When I look at TarEntry.java from the svn link you posted, it clearly shows on line 223: this.size = file.length();

Also, I tried using Ant 1.8.1 before filing the issue and managed to repro it there as well... although on further thought I may have only tested the untar'ing side of things (e.g. untar a tar-ball containing some dir entries with 4k size).

So this is still an open issue.

As for commons-compress: it's awesome that this logic has essentially been factored into a commons library like this. I will definitely move over to that. Thanks a ton for the tip!

(In reply to comment #1)
> The TarEntry constructor has been fixed about two years ago with svn revision
> 755231 - this is part of Ant 1.8.0.
> 
> Since you seem to be using Ant's tar classes as a library let me suggest you
> take a look at Apache Commons Compress http://commons.apache.org/compress/
> (which doesn't read tar archives with directories of a size > 0 properly
> either).
Comment 3 Sandeep Tamhankar 2011-04-12 19:33:04 UTC
I just downloaded commons-compress, and I can see that the TarArchiveEntry(File, String) constructor does the right thing, so I don't have to work around this issue (or fix it in Ant) at this point.

Ant should use commons-compress rather than keeping its current impl.

(In reply to comment #2)
> Umm... I don't think the TarEntry constructor has been fixed. When I look at
> TarEntry.java from the svn link you posted, it clearly shows on line 223:
> this.size = file.length();
> 
> Also, I tried using Ant 1.8.1 before filing the issue and managed to repro it
> there as well... although on further thought I may have only tested the
> untar'ing side of things (e.g. untar a tar-ball containing some dir entries
> with 4k size).
> 
> So this is still an open issue.
> 
> As for commons-compress: it's awesome that this logic has essentially been
> factored into a commons library like this. I will definitely move over to that.
> Thanks a ton for the tip!
> 
> (In reply to comment #1)
> > The TarEntry constructor has been fixed about two years ago with svn revision
> > 755231 - this is part of Ant 1.8.0.
> > 
> > Since you seem to be using Ant's tar classes as a library let me suggest you
> > take a look at Apache Commons Compress http://commons.apache.org/compress/
> > (which doesn't read tar archives with directories of a size > 0 properly
> > either).
Comment 4 Stefan Bodewig 2011-04-20 01:19:55 UTC
(In reply to comment #2)
> Umm... I don't think the TarEntry constructor has been fixed. When I look at
> TarEntry.java from the svn link you posted, it clearly shows on line 223:
> this.size = file.length();

My fault, I gave you the wrong end of the revision diff I was looking at, the
constructor has been fixed in svn revision 755473

> Also, I tried using Ant 1.8.1 before filing the issue and managed to repro it
> there as well... although on further thought I may have only tested the
> untar'ing side of things (e.g. untar a tar-ball containing some dir entries
> with 4k size).

The untaring side hasn't been adapted (and I don't think commons-compress
deals with it either.

(In reply to comment #3)

> Ant should use commons-compress rather than keeping its current impl.

http://ant.apache.org/antlibs/compress/

For Ant core we like to keep the number of our dependencies as small as possible.