Issue 126469 - Failing checksums in ./bootstrap due to archive decompression
Status: RESOLVED FIXED
Alias: None
Product: Build Tools
Classification: Code
Component: external prerequisites
Version: 4.0.0
Hardware: All All
Importance: P3 Normal
Target Milestone: 4.2.0
Assignee: j.nitschke
QA Contact:
URL:
Keywords:
Duplicates: 123673
Depends on:
Blocks: 126765
Reported: 2015-08-16 12:53 UTC by Andrea Pescetti
Modified: 2016-01-09 17:44 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: 4.1.1
Developer Difficulty: Simple


Attachments
demonstrate issue with LWP::Simple::get (942 bytes, text/x-perl)
2015-12-30 16:43 UTC, j.nitschke
replace LWP::Simple with LWP::UserAgent for downloads (2.11 KB, patch)
2015-12-30 19:32 UTC, j.nitschke

Description Andrea Pescetti 2015-08-16 12:53:48 UTC
A typical ./bootstrap execution contains output like the following:

downloading to .../ext_sources/128cfc86ed5953e57fe0f5ae98b62c2e-libtextcat-2.2.tar.gz.part
    MD5 checksum does not match (df9618de4c358e9918a6fa3c6a9f2256 instead of 128cfc86ed5953e57fe0f5ae98b62c2e)
downloading to .../ext_sources/128cfc86ed5953e57fe0f5ae98b62c2e-libtextcat-2.2.tar.gz.part
MD5 checksum is OK

So:
1) Download is attempted from the first URL
2) Checksum does not match
3) Download is attempted from the second URL
4) Checksum matches, process continues
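
The four steps above amount to a checksum-verified fallback loop. The following is a minimal Python sketch of that flow, not the actual logic of download_external_dependencies.pl (which is Perl). The names fetch_with_fallback and md5_hex are hypothetical, and in-memory callables stand in for the two URLs so that the example runs offline.

```python
import gzip
import hashlib
from typing import Callable, Iterable, Optional


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def fetch_with_fallback(
    expected_md5: str,
    fetchers: Iterable[Callable[[], bytes]],
) -> Optional[bytes]:
    """Try each fetcher in order; return the first payload whose MD5 matches."""
    for fetch in fetchers:
        data = fetch()
        actual = md5_hex(data)
        if actual == expected_md5:
            print("MD5 checksum is OK")
            return data
        print(f"MD5 checksum does not match ({actual} instead of {expected_md5})")
    return None  # every source failed verification


# Hypothetical stand-ins for the two URLs (no network access involved).
archive = gzip.compress(b"pretend this is a tar archive")


def primary() -> bytes:
    # First source: the payload arrives already decompressed.
    return gzip.decompress(archive)


def mirror() -> bytes:
    # Second source: the payload arrives intact.
    return archive


result = fetch_with_fallback(md5_hex(archive), [primary, mirror])
```

With these stand-ins the first attempt is rejected and the second is accepted, matching the order of events in the log above.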

The failure in 2) occurs because the file downloaded in 1) is, for some reason, the already decompressed version of the archive we want to download.

Check:
$ gunzip 128cfc86ed5953e57fe0f5ae98b62c2e-libtextcat-2.2.tar.gz 
$ md5sum 128cfc86ed5953e57fe0f5ae98b62c2e-libtextcat-2.2.tar
df9618de4c358e9918a6fa3c6a9f2256  128cfc86ed5953e57fe0f5ae98b62c2e-libtextcat-2.2.tar

(Comparing the checksums makes it clear that the file from the first download had already been decompressed.)
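
The gunzip and md5sum check above can also be automated. This is a minimal Python sketch under the assumption that a known-good copy of the compressed archive is available for comparison. The function name explain_mismatch and the sample payload are hypothetical and not part of the bootstrap scripts.

```python
import gzip
import hashlib


def explain_mismatch(reference_gz: bytes, observed_md5: str) -> str:
    """Classify an observed MD5 against a known-good gzip archive.

    Returns "match" if the hash is that of the archive itself,
    "decompressed" if it is the hash of the gunzipped content,
    and "unknown" otherwise.
    """
    if hashlib.md5(reference_gz).hexdigest() == observed_md5:
        return "match"
    if hashlib.md5(gzip.decompress(reference_gz)).hexdigest() == observed_md5:
        return "decompressed"
    return "unknown"


# Synthetic data standing in for a real .tar.gz and its unpacked .tar.
tar_bytes = b"pretend this is the content of libtextcat-2.2.tar"
tar_gz_bytes = gzip.compress(tar_bytes)

# Hash reported by the failing download in this simulated scenario.
observed = hashlib.md5(tar_bytes).hexdigest()
verdict = explain_mismatch(tar_gz_bytes, observed)
print(verdict)
```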
Comment 1 j.nitschke 2015-12-30 16:38:20 UTC
I had the same issue with the dmms package and narrowed it down to this line:
https://github.com/apache/openoffice/blob/c667dd47ad47dfe33b5fb1af77fca185441b45b4/main/solenv/bin/download_external_dependencies.pl#L543
> my $content = LWP::Simple::get($URL);
For some compressed packages, this call returns the already decompressed content instead of the original archive.

This is likely related to a change in the Perl library (libwww-perl):
https://github.com/gisle/libwww-perl/commit/06e3b04d44a6e4ede79f8c6cc75bfa6eb4d6bac1
> -    return $response->content if $response->is_success;
> +    return $response->decoded_content if $response->is_success;
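
To illustrate why swapping content for decoded_content changes the bytes that end up on disk, here is a toy Python model of the two accessors. It is not LWP's implementation: the ToyResponse class is hypothetical, and it assumes (the thread does not confirm this) that the server labels the .tar.gz payload with "Content-Encoding: gzip", which is the kind of encoding a decoding accessor would strip.

```python
import gzip
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyResponse:
    """Hypothetical stand-in for an HTTP response object."""

    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)

    def content(self) -> bytes:
        # Raw bytes exactly as received from the server.
        return self.body

    def decoded_content(self) -> bytes:
        # Undo the transfer encoding declared by the server, if any.
        if self.headers.get("Content-Encoding", "").lower() == "gzip":
            return gzip.decompress(self.body)
        return self.body


tar_bytes = b"pretend this is a tar archive"
tar_gz_bytes = gzip.compress(tar_bytes)

# Assumption: the server declares the archive's own gzip layer as an encoding.
response = ToyResponse(tar_gz_bytes, {"Content-Encoding": "gzip"})

raw = response.content()              # still the .tar.gz, checksum would match
decoded = response.decoded_content()  # the bare .tar, checksum would differ
print(raw == tar_gz_bytes, decoded == tar_bytes)
```

Under that assumption, a downloader that wants to verify the archive's checksum has to save the raw bytes, which is what the old behaviour did.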

We could emulate the old behaviour by using LWP::UserAgent.
I'll attach a Perl script to demonstrate the issue.

Once this is fixed, some hashes will need to be changed;
so far only libtextcat and dmms.
Comment 2 j.nitschke 2015-12-30 16:43:11 UTC
Created attachment 85228
demonstrate issue with LWP::Simple::get

The script downloads libtextcat with both LWP::Simple and LWP::UserAgent
and saves the results as GetSimple-... and GetAgent-... respectively.

Typo in comment 1:
the package is mdds, not dmms.
Comment 3 Andrea Pescetti 2015-12-30 16:51:20 UTC
I confirm that on my system I get:
$ file GetSimple-libtextcat-2.2.tar.gz 
GetSimple-libtextcat-2.2.tar.gz: tar archive
$ file GetAgent-libtextcat-2.2.tar.gz 
GetAgent-libtextcat-2.2.tar.gz: gzip compressed data

so indeed the GetAgent file is downloaded correctly, preserving compression.
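
The distinction that the file command draws here can be approximated by checking magic bytes. This is a minimal Python sketch and not a replacement for file: the function name sniff is hypothetical, and it only recognises the gzip signature (bytes 1f 8b) and the "ustar" marker at offset 257 of a tar header.

```python
import gzip
import io
import tarfile


def sniff(data: bytes) -> str:
    """Guess whether data is gzip-compressed or a bare tar archive."""
    if data[:2] == b"\x1f\x8b":
        return "gzip compressed data"
    if data[257:262] == b"ustar":
        return "tar archive"
    return "unknown"


# Build a tiny tar archive in memory to test against.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    payload = b"hello"
    info = tarfile.TarInfo("hello.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

tar_bytes = buffer.getvalue()
tar_gz_bytes = gzip.compress(tar_bytes)

print(sniff(tar_bytes))
print(sniff(tar_gz_bytes))
```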

Note that I don't understand why you believe that we'll need to change some hashes: all hashes now match with the fallback (ooo-extras) ones. Or at least this is what we expect.

Thanks for looking into this. It seems you are close to having a patch for download_external_dependencies.pl; I'll happily review your work if you have time to improve this!
Comment 4 j.nitschke 2015-12-30 17:07:12 UTC
(In reply to Andrea Pescetti from comment #3)
> Note that I don't understand why you believe that we'll need to change some
> hashes: all hashes now match with the fallback (ooo-extras) ones. Or at
> least this is what we expect.
Indeed, I didn't notice that the first attempt was failing.
I tried with http://kohei.us/files/mdds/src/mdds_1.0.0.tar.bz2
and assumed that mdds_0.3.1.tar.bz2 from this location gets decompressed too, but it is no longer on this site.
Comment 5 j.nitschke 2015-12-30 19:32:17 UTC
Created attachment 85234
replace LWP::Simple with LWP::UserAgent for downloads

Here is my patch.
It also removes dead code from the older download method.

I tested bootstrap with an empty /ext_sources/:
all 31 files were downloaded and their hashes matched
(note: minimal build, so not all package downloads were tested).
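
A check like "all hashes matched" can be repeated after the fact, because the file names shown in the description carry the expected MD5 as a prefix (for example 128cfc86ed5953e57fe0f5ae98b62c2e-libtextcat-2.2.tar.gz). The following is a minimal Python sketch under that assumption. The function name check_ext_sources is hypothetical, and the demo runs against a temporary directory with synthetic files, not a real ext_sources tree.

```python
import hashlib
import re
import tempfile
from pathlib import Path
from typing import Dict

# Assumed naming scheme: <32 hex digits>-<original file name>
NAME_RE = re.compile(r"^([0-9a-f]{32})-(.+)$")


def check_ext_sources(directory: Path) -> Dict[str, bool]:
    """Map each '<md5>-<name>' file to whether its content matches the prefix."""
    results = {}
    for path in sorted(directory.iterdir()):
        match = NAME_RE.match(path.name)
        if match is None or not path.is_file():
            continue  # skip files that do not follow the naming scheme
        actual = hashlib.md5(path.read_bytes()).hexdigest()
        results[path.name] = actual == match.group(1)
    return results


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    good = b"intact archive bytes"
    (root / f"{hashlib.md5(good).hexdigest()}-good.tar.gz").write_bytes(good)
    (root / f"{'0' * 32}-bad.tar.gz").write_bytes(b"anything")
    (root / "README").write_bytes(b"ignored")
    report = check_ext_sources(root)

for name, ok in report.items():
    print("OK  " if ok else "FAIL", name)
```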

Some links are offline, but the files are then loaded from the mirror.
FTP downloads never worked: libxml2 and other random FTP downloads always fail with both methods.
The only FTP sources still online, libxml and libxslt, have an HTTP mirror:
http://xmlsoft.org/sources/old/

One oddity: http://www.ijg.org/files/jpegsrc.v8d.tar.gz would not load with either method, although it is online and loads fine in the browser.
Comment 6 j.nitschke 2016-01-09 13:20:45 UTC
*** Issue 123673 has been marked as a duplicate of this issue. ***
Comment 7 Andrea Pescetti 2016-01-09 17:44:38 UTC
Thank you! Patch applied in revision 1723866.