Summary: | Untar target does not handle long filenames in POSIX tar files | ||
---|---|---|---|
Product: | Ant | Reporter: | Peter Liljenberg <pliljenberg> |
Component: | Core | Assignee: | Ant Notifications List <notifications> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | bob, sagi.benakiva, vsizikov |
Priority: | P2 | Keywords: | PatchAvailable |
Version: | 1.7.0 | ||
Target Milestone: | 1.9.0 | ||
Hardware: | All | ||
OS: | All | ||
Attachments: | Testtar file; Proposed patch for this issue; Patch to fix Posix prefix handling | ||
Description
Peter Liljenberg
2007-03-21 15:46:00 UTC
Created attachment 19769 [details]
Testtar file
Test tar file that will break the untar target
The problem can be reproduced by using the supplied tar file (test.tar) with the untar target.

Created attachment 19770 [details]
Proposed patch for this issue
I've provided a proposal for a patch to resolve the issue. Can someone verify that I haven't created some other bugs with this patch?

Currently untar only supports GNU tar long file names. To also support the POSIX 2001 format for long file names, a further check should be done on the header to verify whether the file is in that format.

This is a pretty serious issue for us, since Git now uses the POSIX tar format for its tarballs, and this makes it impossible to extract the content of such tarballs properly with Ant.

You could try the supplied patch; it did work for me. It's not 100% tested or verified, but it solved the trouble with long filenames for me. I'm not using it anymore, though, since we migrated to Maven.

Documenting a bug is not fixing it. I did not see the note that subtly pointed out that extracting ant with standard tar WILL FAIL. This caused me unnecessary wasted time. I have NEVER seen a GNU or other Free Software project tolerate such sloppyness, only Microsoft.

Recommended fix #1:
Shorten pathnames to 100 characters.

Recommended fix #2:
Use nested tar files with the "inside" tar archives having files relative to higher directories. In other words, for the file long1/long2/long3.java, have the main "top level" tar file have the file long1/long2/short.tar with short.tar having files relative to long1/long2, such as just long3.java. Then, as part of the build procedure do "cd long1/long2;tar -xf short.tar".

Recommended fix #3 (and least desirable):
Have the ./configure test for the existence of one of the long file names. If it does not exist (and maybe even test for the existence of the name truncated to 100 characters). If the long file name does not exist then the configure should fail with an explanation. This should be trivial to add.

(In reply to comment #8)
> Documenting a bug is not fixing it. I did not see the note that subtly pointed
> out that extracting ant with standard tar WILL FAIL. This caused me
> unnecessary wasted time. I have NEVER seen a GNU or other Free Software
> project tolerate such sloppyness, only Microsoft.

Your problem is with extracting Ant itself? That actually isn't related to this issue. For what it's worth, though, poor spelling might itself be taken as a sign of "sloppiness," as might failure to read directions. Further, your invocation of the holy name of GNU leads me to point out that the issue in question being with GNU tar formats, it's obvious "plain" tar didn't satisfy that organization either.

>
> Recommended fix #1:
> Shorten pathnames to 100 characters.

That's like saying "Some cars are small. I'll cut off my head so I can fit into one of these." You wouldn't do that; you'd just use a car into which you can fit.

>
> Recommended fix #2:
> Use nested tar files with the "inside" tar archives having files relative
> to higher directories. In other words, for the file long1/long2/long3.java,
> have the main "top level" tar file have the file long1/long2/short.tar with
> short.tar having files relative to long1/long2, such as just long3.java.
> Then, as part of the build procedure do "cd long1/long2;tar -xf short.tar".

There really isn't a build procedure, per se. Extract and go.

>
> Recommended fix #3 (and least desirable):
> Have the ./configure test for the existence of one of the long file names.
> If it does not exist (and maybe even test for the existence of the name
> truncated to 100 characters). If the long file name does not exist then
> the configure should fail with an explanation. This should be trivial to
> add.
>

Once again, there is no configure script shipped with Ant, nor is there a makefile. You DO know what project this is, right?

On Solaris 10, a POSIX-compliant tar file is created whenever an entry name is 100 characters or longer (a Solaris peculiarity). When you process such a file with Ant, the prefix part is ignored and the file is extracted into the root directory, which is definitely wrong.
So I strongly recommend a fix to deal with this issue. To make sure this is done only for POSIX-compliant tar archives, you need to check whether the "ustar" marker followed by a zero byte is present. With that in place, the check supplied with the patch should be sufficient. I modified TarEntry.java as follows, adding this at the end of the function:

    public void parseTarHeader(byte[] header) {
        // ... original code here

        boolean ustarFormat = false;
        //
        // NOTE Recognize archive header format.
        //
        if (   header[257] == 'u'
            && header[258] == 's'
            && header[259] == 't'
            && header[260] == 'a'
            && header[261] == 'r'
            && header[262] == 0 ) {
            ustarFormat = true;
        } /* if */

        if (ustarFormat && header[offset] != 0) {
            offset += DEVLEN;
            StringBuffer buf = new StringBuffer(156);
            buf = TarUtils.parseName(header, offset, 155);
            buf.append('/');
            buf.append(name);
            name = buf;
        } /* if */
    }

This was fixed in Commons Compress some while ago - see https://issues.apache.org/jira/browse/COMPRESS-110
[Note that WinZip 9.0 also has the same issue; 7-Zip does not]

Created attachment 27419 [details]
Patch to fix Posix prefix handling
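
For readers following the thread, the prefix handling discussed above can be sketched in isolation. This is only an illustration, not the attached patch and not Ant's actual code: the class and method names (UstarNameSketch, isPosixUstar, fullName) are made up for this example. It assumes a 512-byte header block with the standard ustar layout: name at offset 0 (100 bytes), magic at offset 257, prefix at offset 345 (155 bytes).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; none of these names exist in Ant.
final class UstarNameSketch {
    static final int NAME_OFFSET = 0;
    static final int NAME_LEN = 100;
    static final int MAGIC_OFFSET = 257;
    static final int PREFIX_OFFSET = 345;
    static final int PREFIX_LEN = 155;

    // POSIX ustar headers carry "ustar" followed by a NUL byte.
    // Old GNU headers carry "ustar" followed by spaces, so they fail this check.
    static boolean isPosixUstar(byte[] header) {
        byte[] magic = {'u', 's', 't', 'a', 'r', 0};
        for (int i = 0; i < magic.length; i++) {
            if (header[MAGIC_OFFSET + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads a NUL-terminated (or full-width) ASCII field from the header.
    static String field(byte[] header, int offset, int len) {
        int end = offset;
        while (end < offset + len && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.US_ASCII);
    }

    // Joins prefix and name with '/' only for POSIX ustar headers
    // that actually have a non-empty prefix field.
    static String fullName(byte[] header) {
        String name = field(header, NAME_OFFSET, NAME_LEN);
        if (isPosixUstar(header)) {
            String prefix = field(header, PREFIX_OFFSET, PREFIX_LEN);
            if (!prefix.isEmpty()) {
                return prefix + "/" + name;
            }
        }
        return name;
    }
}
```

Checking the magic before reading the prefix matters because old GNU headers may use the bytes at offset 345 for other fields, so treating them as a path prefix could produce garbage names.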
Are you sure POSIX longfile support in Commons Compress is complete? If it is, then using the Compress Antlib with Commons Compress 1.2 will work.

Is that why ant fails to extract this file correctly?
http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/snapshot/jetty-8.1.4.v20120524.tar.bz2

(In reply to comment #14)
> Is that why ant fails to extract this file correctly?
> http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/snapshot/jetty-8.1.4.v20120524.tar.bz2

yes

fixed with svn revision 1350857 by merging Commons Compress' (1.4.1) code into Ant

Hi,
I'm suffering from a similar problem, but in my case the file with the long name is not a regular file but a soft link. In my case the link name is very long and not the link full path (as in the Testtar file).
I looked at tar source code and I think that the solution for this issue is not complete. Beside the definition for GNUTYPE_LONGNAME, there's a definition for GNUTYPE_LONGLINK, i.e. (from tar.h):

    /* Identifies the *next* file on the tape as having a long linkname. */
    #define GNUTYPE_LONGLINK 'K'

in my testcase the function TarEntry::isGNULongNameEntry returns FALSE because linkFlag != LF_GNUTYPE_LONGNAME (linkFlag == (byte)'K')
I even noticed that there's no definition for LF_GNUTYPE_LONGLINK in TarConstants.java

Thank you,
Sagi.
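
To illustrate the distinction raised in the last comment, here is a minimal sketch that classifies a header by its type flag. It is not Ant's TarEntry or TarConstants code: the class, enum, and method names are hypothetical, and it assumes only that the type flag sits at offset 156 of the header block, with 'L' marking a GNU long name and 'K' a GNU long link name (as in the tar.h excerpt quoted above). A real implementation would likely also check the entry name and read the following data blocks, which is omitted here.

```java
// Hypothetical sketch for illustration; not part of Ant.
final class GnuLongEntrySketch {
    static final int TYPEFLAG_OFFSET = 156;          // assumed standard tar layout
    static final byte GNUTYPE_LONGNAME = (byte) 'L'; // next entry has a long name
    static final byte GNUTYPE_LONGLINK = (byte) 'K'; // next entry has a long link name

    enum Kind { LONG_NAME, LONG_LINK, ORDINARY }

    // Classifies a header purely by its type flag byte.
    static Kind classify(byte[] header) {
        byte flag = header[TYPEFLAG_OFFSET];
        if (flag == GNUTYPE_LONGNAME) {
            return Kind.LONG_NAME;
        }
        if (flag == GNUTYPE_LONGLINK) {
            return Kind.LONG_LINK;
        }
        return Kind.ORDINARY;
    }
}
```

A reader that only tests for 'L' would treat a 'K' header as an ordinary entry, which matches the behaviour described in the comment above.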