Bug 53814 - Could not display PDF file on Tomcat 7.0.27 and above.
Status: RESOLVED INVALID
Product: Tomcat 7
Classification: Unclassified
Component: Catalina
Version: 7.0.27
Hardware: HP Linux
Importance: P2 normal
Target Milestone: ---
Assigned To: Tomcat Developers Mailing List
Reported: 2012-09-02 12:38 UTC by manabu.shibata
Modified: 2012-11-03 20:58 UTC (History)
Description manabu.shibata 2012-09-02 12:38:16 UTC
When I deploy a PDF file in the ROOT application and try to download it by accessing its URL directly (e.g. http://localhost:8080/example.pdf), the browser opens Adobe Acrobat Reader, but the Reader shows an error message and cannot display the PDF file.

Tomcat does not log any error.
The access log shows that Acrobat Reader makes a partial (range) request.

---
192.168.58.97 - - [02/Sep/2012:21:05:47 +0900] "GET /vsphere-esxi-vcenter-server-50-basics-guide.pdf HTTP/1.1" 206 4096
---

When I deploy the same PDF file to Tomcat 7.0.26 running on the same machine, Acrobat Reader displays the file correctly.

Tomcat 7.0.28 and 7.0.29 have the same problem.

----- Environment -----
Server: RHEL5.8
JDK:1.6.0u33
Tomcat:7.0.27/7.0.28/7.0.29

Client
OS: Windows7
Browser: IE9
PDF Viewer:Adobe Acrobat Reader 9.5.0
Comment 1 Konstantin Kolinko 2012-09-02 14:50:45 UTC
There have been several threads on users@ regarding this,

[1] "PDF Download problem tomcat >= 7.0.27" started Jul 30, 2012
http://markmail.org/thread/lkr6touhymrxn4rg
http://marc.info/?t=134364331600004&r=1&w=2

[2] "Tomcat 7.x and Internet Explorer Adobe Reader plugin" started Aug 21, 2012
http://markmail.org/thread/loubsqje3ssqg7x7
http://marc.info/?t=134555927900004&r=1&w=2

Several notes:
--------------
1. Thus far nobody has shown any real data on what is wrong in Tomcat behaviour.

Without this, no real fix can be made.
I am changing the state of this issue to NEEDINFO.

2. According to [2] the problem appears only in IE, but not in Firefox or Chrome.
3. According to [1] the problem does not appear in Adobe Reader 10.

According to the Adobe site, Acrobat 9/Reader 9 are still supported, but their EOL is June 26, 2013. Maybe someone should contact their support?

http://blogs.adobe.com/adobereader/2012/06/one-year-from-now-adobe-reader-and-acrobat-9-eol.html

4. A message that cites real request data (Jul 31)
http://markmail.org/message/3ylg5wdzmv4yd6fi
[[[
206KO:
GET /test.pdf HTTP/1.1
Accept: */*
Range: bytes=3446021-3447865, 475136-1792507
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Host: xxxx:8080
Connection: Keep-Alive
Pragma: no-cache

HTTP/1.1 206 Partial Content
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"3447866-1343391729000"
Last-Modified: Fri, 27 Jul 2012 12:22:09 GMT
Content-Type: multipart/byteranges;boundary=CATALINA_MIME_BOUNDARY
Date: Tue, 31 Jul 2012 12:32:20 GMT
Content-Length: 1319458 
]]]

5. If the problem is in how range requests are served,
I can say the following:

- The component that serves the response here is org.apache.catalina.servlets.DefaultServlet.
Maybe someone could find a clue there.

- It is possible to disable support for range requests by setting its init parameter "useAcceptRanges" to the value of "false". See
http://tomcat.apache.org/tomcat-7.0-doc/default-servlet.html
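
As a sanity check on the exchange quoted in note 4, the requested ranges can be added up and compared with the Content-Length of the response. The sketch below is illustrative only: the class RangeMath and its methods are hypothetical names, not Tomcat code, and it handles only the explicit first-last range form used in that request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tomcat: sums the bytes asked for by a Range header.
final class RangeMath {

    /** Parses "bytes=a-b, c-d" into inclusive [first, last] pairs. Only the explicit first-last form is handled. */
    static List<long[]> parse(String header) {
        String prefix = "bytes=";
        if (!header.startsWith(prefix)) {
            throw new IllegalArgumentException("unsupported range unit: " + header);
        }
        List<long[]> ranges = new ArrayList<>();
        for (String spec : header.substring(prefix.length()).split(",")) {
            String[] ends = spec.trim().split("-");
            if (ends.length != 2) {
                throw new IllegalArgumentException("unsupported range spec: " + spec);
            }
            long first = Long.parseLong(ends[0].trim());
            long last = Long.parseLong(ends[1].trim());
            if (first > last) {
                throw new IllegalArgumentException("first > last in: " + spec);
            }
            ranges.add(new long[] {first, last});
        }
        return ranges;
    }

    /** Total number of payload bytes covered by all ranges (both ends inclusive). */
    static long totalBytes(String header) {
        long total = 0;
        for (long[] r : parse(header)) {
            total += r[1] - r[0] + 1;
        }
        return total;
    }

    public static void main(String[] args) {
        // Values copied from the request/response quoted in note 4.
        String range = "bytes=3446021-3447865, 475136-1792507";
        long contentLength = 1319458L;
        long payload = totalBytes(range);
        System.out.println("payload bytes: " + payload);
        System.out.println("remaining bytes: " + (contentLength - payload));
    }
}
```

For the quoted request the two ranges cover 1845 + 1317372 = 1319217 bytes, which leaves 241 bytes of the 1319458-byte Content-Length. That remainder is of the size one would expect for the boundary lines and per-part headers of a two-part multipart/byteranges body, so the length alone does not point to a problem.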
Comment 2 Konstantin Preißer 2012-09-04 22:43:40 UTC
Hi,

I can reproduce this on a Windows 7 (SP1) x64 machine with Java 1.7.0_07 (64-Bit), Tomcat 7.0.30, using HTTP-APR (TC Native 1.1.24, 64-Bit), and with IE9 + Adobe Reader 9.5.0 (on the same machine). I used the PDF file from the OP in http://markmail.org/thread/lkr6touhymrxn4rg.

To record the data which is sent on TCP connections to Tomcat, I created a small Java program that redirects TCP connections and writes the exact data which is transmitted to two file streams (one for sending and one for receiving) per opened TCP connection.

I have uploaded a zip file which contains the original PDF file, and two folders which contain TCP stream data for Tomcat 7.0.30 and Tomcat 7.0.26. For each TCP connection, there is a pair of files, one with "In" and one with "Out".
It is available here: http://preisser.dynalias.org/dere1/Tomcat-PDF-Test.zip

What I can see is:
- With Tomcat 7.0.26, no problem occurs: IE does a request to the PDF file, and after some KB are transferred, the TCP connection is canceled. Then a new TCP connection is created, probably by Adobe Reader, which sends multiple requests with byte ranges (for the "Fast Web View").

- With Tomcat 7.0.30, the behavior is the same, except that the Adobe Reader sometimes seems to cancel one of the TCP connections and then displays a "Network Error" (because of that, the file "TCP-Data TC 7.0.30/Sock-1-In.txt" has only 8.00 KiB).

However, I don't know why Adobe Reader seems to think that there is an error, as I could not find anything wrong with Tomcat's response - the byte range response seems correct to me.

Maybe someone who has more detailed knowledge of the HTTP protocol can examine the files to see if anything is going wrong...
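
For anyone who wants to capture similar data, a recording TCP forwarder of the kind described above can be sketched roughly as follows. This is not the program used for the uploaded captures; the class name TcpTee, the ports, and the mapping of "In" to client-to-server bytes and "Out" to server-to-client bytes are all assumptions made for illustration.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative sketch only: forwards TCP connections and records both directions to files.
final class TcpTee {

    /** Copies everything from 'in' to 'out' and duplicates it into 'log'. Returns the number of bytes copied. */
    static long pump(InputStream in, OutputStream out, OutputStream log) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            out.flush();
            log.write(buf, 0, n);
            log.flush();
            total += n;
        }
        return total;
    }

    /** Pumps one direction on its own thread; closes both sockets once that direction ends. */
    static Thread pumpAsync(Socket from, Socket to, String logFile) {
        Thread t = new Thread(() -> {
            try (OutputStream log = new FileOutputStream(logFile)) {
                pump(from.getInputStream(), to.getOutputStream(), log);
            } catch (IOException e) {
                // A reset or closed connection simply ends the recording for this direction.
            } finally {
                try { from.close(); } catch (IOException ignored) { }
                try { to.close(); } catch (IOException ignored) { }
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws IOException {
        int listenPort = 8081;           // assumed: port the browser connects to
        String targetHost = "localhost"; // assumed: where Tomcat is running
        int targetPort = 8080;           // assumed: Tomcat's HTTP connector port
        try (ServerSocket server = new ServerSocket(listenPort)) {
            for (int id = 1; ; id++) {
                Socket client = server.accept();
                Socket upstream = new Socket(targetHost, targetPort);
                pumpAsync(client, upstream, "Sock-" + id + "-In.txt");   // client -> server (assumed naming)
                pumpAsync(upstream, client, "Sock-" + id + "-Out.txt");  // server -> client (assumed naming)
            }
        }
    }
}
```

With the forwarder running, the browser would be pointed at port 8081 instead of 8080. Note that this sketch closes both sockets as soon as either side finishes, which is adequate for observing when a client drops a connection but does not model half-closed connections.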
Comment 3 Konstantin Kolinko 2012-09-05 04:19:41 UTC
One small difference is
7.0.26:
Content-Type: multipart/byteranges; boundary=CATALINA_MIME_BOUNDARY

7.0.30:
Content-Type: multipart/byteranges;boundary=CATALINA_MIME_BOUNDARY

There is no whitespace in ";boundary". (See bug 52811 for a cause of this change).

----------
Well, the grammar is (RFC 2616)

       media-type     = type "/" subtype *( ";" parameter )
       parameter      = attribute "=" value

or (RFC 2045)

  content := "Content-Type" ":" type "/" subtype
             *(";" parameter)
             ; Matching of media type and subtype
             ; is ALWAYS case-insensitive.
  parameter := attribute "=" value

so officially there is no need for a whitespace there.
If Adobe Reader indeed expects a '; ', then they are not following the specification.

I note, though, that many (if not all) examples in the specification have a whitespace before the parameter. It seems that it would be more fool-proof to always include a whitespace there.

See o.a.tomcat.util.http.parser.AstMediaType#toString(), #toStringNoCharset()
s/sb.append(';');/sb.append("; ");/
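
To illustrate the idea (accept either form when reading, always emit "; " when writing), here is a minimal sketch. It is not the AstMediaType code; MediaTypeSketch is a hypothetical class, and it splits naively on ";" so it does not handle quoted parameter values that contain a semicolon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; this is not Tomcat's AstMediaType.
final class MediaTypeSketch {
    final String type;
    final String subtype;
    final Map<String, String> parameters = new LinkedHashMap<>();

    MediaTypeSketch(String type, String subtype) {
        this.type = type;
        this.subtype = subtype;
    }

    /** Parses type "/" subtype *( ";" parameter ), tolerating optional whitespace around each ";". */
    static MediaTypeSketch parse(String value) {
        String[] parts = value.split(";");
        String[] names = parts[0].trim().split("/", 2);
        if (names.length != 2) {
            throw new IllegalArgumentException("not a media type: " + value);
        }
        MediaTypeSketch mediaType = new MediaTypeSketch(names[0], names[1]);
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].trim().split("=", 2);
            if (pair.length != 2) {
                throw new IllegalArgumentException("not a parameter: " + parts[i]);
            }
            mediaType.parameters.put(pair[0], pair[1]);
        }
        return mediaType;
    }

    /** Serializes with "; " before every parameter, the form the affected client accepts. */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(type).append('/').append(subtype);
        for (Map.Entry<String, String> p : parameters.entrySet()) {
            sb.append("; ").append(p.getKey()).append('=').append(p.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both header values from 7.0.26 and 7.0.30 serialize to the same string.
        System.out.println(parse("multipart/byteranges;boundary=CATALINA_MIME_BOUNDARY"));
        System.out.println(parse("multipart/byteranges; boundary=CATALINA_MIME_BOUNDARY"));
    }
}
```

Both calls in main print "multipart/byteranges; boundary=CATALINA_MIME_BOUNDARY", i.e. the two header values are equivalent under the grammar and differ only in how they are written out.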
Comment 4 Konstantin Preißer 2012-09-05 16:59:42 UTC
Hi Konstantin,

(In reply to comment #3)
> See o.a.tomcat.util.http.parser.AstMediaType#toString(), #toStringNoCharset()
> s/sb.append(';');/sb.append("; ");/

Thank you.
You are right: When I apply this change to the Tomcat 7.0.30 sources, then Adobe Reader loads the PDF in IE without any error.

So it seems that indeed the client (I guess IE if Adobe Reader uses its API to do HTTP requests, since IE's User-Agent is submitted and the problem does not occur on Firefox or Chrome) is expecting a whitespace after the ";".

Although it is not really a Tomcat bug since the absent whitespace is spec-compliant as you said, I also think a whitespace should be added to not exclude non-spec-compliant clients.
Comment 5 Mark Thomas 2012-09-05 17:11:35 UTC
This is a client bug, not a Tomcat one.

Generally, the Tomcat team does not make changes to work-around buggy clients.

Given which client is broken (Adobe Reader on IE), the wide user base is potentially a reason to provide an optional work-around. That is probably best handled as a separate enhancement request with a better description rather than re-opening and editing this issue. Personally, I'd like to see some evidence that the bug had been reported to Adobe and they have refused to fix it before we even consider adding the work-around to Tomcat. Of course, there is always the "use a different browser" work-around.
Comment 6 Mark Thomas 2012-11-03 20:58:41 UTC
I ended up revisiting the HTTP header parser as a result of another bug. The parser for Content-Type has been replaced and the new version contains a work-around for this bug. The work-around will remain in place as long as the following are true:
- Adobe Acrobat Reader 9 is supported by Adobe
- Adobe has not fixed the bug in Acrobat Reader 9

Since the work-around is HTTP compliant and should not affect any other user agents, the work-around is hard-coded to enabled.