Bug 51461

Summary: Infinite loop in IOUtils.readFully() when channel.read() returns 0 on specific files
Product: POI Reporter: Gabriele Columbro <gabriele>
Component: POIFSAssignee: POI Developers List <dev>
Status: RESOLVED DUPLICATE    
Severity: normal    
Priority: P2    
Version: 3.8-dev   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: Examples of 66048 bytes XLS which generates and infinite loop
Example of XLS which does NOT generate infinite loop
Test case which generates the infinite loop with "broken.xls"
1 character patch which makes the test case pass with "broken.xls"
Example of 66048 bytes XLS which generates and infinite loop (clean)

Description Gabriele Columbro 2011-07-01 12:00:23 UTC
Created attachment 27238 [details]
Examples of 66048 bytes XLS which generates and infinite loop

In IOUtils.readFully() with certain specific file types / size (tested with Microsoft Excel) I had an infinite loop when basically channel.read() infinitely returns 0 (instead of -1 or the number of bytes read).

As per JDK 1.6 NIO documentation (http://download.oracle.com/javase/6/docs/api/java/nio/channels/ReadableByteChannel.html), it's "possible" that ReadableByteChannel.read returns 0, quoting:

int read()

Returns:
    The number of bytes read, possibly zero, or -1 if the channel has reached end-of-stream 


Now basically I got an infinite loop on all files which are funnily enough of the same file size (66048 bytes = 65536 content + 512 header), so I'm wondering whether this is related to that or to the specific file.

I'm attaching the XLS which generates the infinite loop (broken.xls) and a working one (working.xls).

I also created a test case mimicking the way I was reading the file (using Tika OfficeParser) which shows the issue (and the working case). In order to run the JUnit you'll need the two test files on the same folder you run it from.

I fine wrote a tentative patch (1 character), which passes the test case and temporarily solved the problem in the production instance I was working on, but I'm not fully sure it's the correct way to do it.

I had this on Win 2008 Server (virtualized with VMWare) and my local MacOSX 10.6
Comment 1 Gabriele Columbro 2011-07-01 12:01:19 UTC
Created attachment 27239 [details]
Example of XLS which does NOT generate infinite loop

Other file needed to run the test case I'm gonna attach
Comment 2 Gabriele Columbro 2011-07-01 12:02:27 UTC
Created attachment 27240 [details]
Test case which generates the infinite loop with "broken.xls"

This JUnit test case breaks with broken.xls and will work with working.xls using poi-3.8-beta3.

The JUnit passes when applying the patch I'm gonna attach.
Comment 3 Gabriele Columbro 2011-07-01 12:06:08 UTC
Created attachment 27241 [details]
1 character patch which makes the test case pass with "broken.xls"

The patch basically handles the case that 

channel.read()

returns 0 indefinitely.

It only adds a = to the < condition in the if.

Test case passes, but not sure why the issue generates (seems related to the file size being multiple of 512 but I tried with a simple 512 header + 512 content xls file and it actually worked without the patch, so a bit puzzled :)
Comment 4 Nick Burch 2011-07-01 12:06:58 UTC
Hmm, pesky. I had hit something like this before, but I thought I'd fixed it already (but checking when we reached the end of the block). It does seem to be something around reading to the end of the last block when it's an exact multiple.

What's the proposed fix?
Comment 5 Gabriele Columbro 2011-07-01 13:04:24 UTC
Created attachment 27242 [details]
Example of 66048 bytes XLS which generates and infinite loop (clean)
Comment 6 Tony Stevenson 2011-07-01 13:05:02 UTC
The content of attachment 27238 [details] has been deleted by
    Tony Stevenson <tony@pc-tony.com>
who provided the following reason:

Asked for by the user - 13:55:25 < mindthegab> pctony: can you try this link https://issues.apache.org/bugzilla/attachment.cgi?id=27238&action=delete ?

The token used to delete this attachment was generated at 2011-07-01 13:04:41 UTC.
Comment 7 Gabriele Columbro 2011-07-01 13:10:38 UTC
Thanks Tony!

For the test case purposes please check attachment 

https://issues.apache.org/bugzilla/attachment.cgi?id=27242


NOTE: 

You might need to remame it to broken.xls (or change the test case which expects this name), since now it was renamed by bugzilla to broken-1.xls
Comment 8 Nick Burch 2011-07-01 15:15:51 UTC
I can't reproduce the infinite loop using your supplied file - it works fine for me

I'm not sure we want to stop reading as soon as we hit zero bytes back from a read call though, as your patch suggests. Think of the case of reading from a slop stream, eg from a network. It's quite possible for that to return zero when there's no data yet, but more still to come. The contract for readFully is that it should keep going until there is no more data available, which we'd break for that situation

At the point your file hits an infinite loop, what are the values of:
 * total 
 * b.capacity() 
 * b.position()
?
Comment 9 Nick Burch 2011-07-18 19:52:02 UTC
There's a rumour that one of the machines at work might be showing up this JVM specific issue too, which'll hopefully let me get the details on what the state is when the loop is triggered if Gabriele isn't able to do so.

(Until we know what's going on on these JVM's when they loop, we can't identify if it's our fault or the JVM's, and try to fix/workaround as appropriate)
Comment 10 Torben Quast 2011-07-19 07:29:32 UTC
Hey Nick, 
I do have the same problem an I was able to reproduce it with the attached file.
The values you've requested are:
total = 65536
b.capacity = 66048
b.position = 66048
Let me know if you need more data.
Regards, 
Torben
Comment 11 Nick Burch 2011-07-19 11:29:07 UTC
(In reply to comment #10)
> The values you've requested are:
> total = 65536
> b.capacity = 66048
> b.position = 66048

That should be breaking out on

         if (total == b.capacity() || b.position() == b.capacity()) {

That was fixed as part of issue #51100 on 2011-04-21, so I think the issue is people using an older version of POI without this fix

Gabriele or Torben - can you confirm when the version of POI you're using dates from? (I fear it may be earlier in April than this)
Comment 12 Torben Quast 2011-07-19 12:13:16 UTC
Hey Nick, 
you are right. I'm using the version 20110401. I also had a peek at the svn but I only compared the files "manually" :-). Didn't see (without code highlighting) that there are two methods "readFully" and I only saw the line "if (total == len)" and thought that nothing had changed.
Thanks for the hint.
Regards, 
Torben
Comment 13 Nick Burch 2011-07-20 11:43:15 UTC
Looks like this is actually bug #51100, but everyone got confused and were sure the fix from that was present, when alas it wasn't...

*** This bug has been marked as a duplicate of bug 51100 ***