Bug 62815 - Incorrect "0" value for largish integers in xlsb files
Summary: Incorrect "0" value for largish integers in xlsb files
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2018-10-10 19:57 UTC by Tim Allison
Modified: 2018-10-11 14:15 UTC

triggering document (13.52 KB, application/vnd.ms-excel.sheet.binary.macroEnabled.12)
2018-10-10 19:57 UTC, Tim Allison

Description Tim Allison 2018-10-10 19:57:53 UTC
Created attachment 36194
triggering document

On the user list, Dejan Ikodinovic noted that some large integer values are incorrectly extracted as "0" in xlsb.

I can reproduce this with the attached file, which, in Tika, yields:

<table><tbody><tr>      <td>1880000</td>        <td>10000000</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>1880004</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>1880008</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>0</td></tr>
<tr>    <td>1880012</td></tr>

I haven't identified the cause yet. The problem could be at the Tika level, but my guess is that I botched something at the POI level.

As a side note, if I save the file as xlsx, the numbers are extracted correctly.
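
One pattern in the output above may help narrow the search: every value that survives (1880000, 1880004, 1880008, 1880012, 10000000) is exactly representable with only 18 mantissa bits, while the cells that come out as "0" sit between them. That is consistent with, but does not prove, a problem in decoding the integer form of RK numbers, the compact 32-bit encoding that the binary Excel formats can use for numeric cells. The sketch below is a minimal, standalone illustration of the RK layout (bit 0 = divide by 100, bit 1 = integer payload, upper 30 bits = value). The class and method names are hypothetical and are not POI's API, and which encoding Excel actually chose for each cell in the attachment is an assumption.

```java
final class RkNumberSketch {

    /** Decodes a 32-bit RK value into a double. */
    static double decodeRk(int raw) {
        boolean div100 = (raw & 0x01) != 0; // bit 0: stored value is 100x the cell value
        boolean isInt = (raw & 0x02) != 0;  // bit 1: payload is a signed 30-bit integer
        double value;
        if (isInt) {
            value = raw >> 2; // arithmetic shift preserves the sign
        } else {
            // Payload is the top 30 bits of an IEEE 754 double; the low 34 bits are zero.
            long bits = ((long) (raw & 0xFFFFFFFC)) << 32;
            value = Double.longBitsToDouble(bits);
        }
        return div100 ? value / 100.0 : value;
    }

    /** True if v is unchanged when only the top 30 bits of its double encoding are kept. */
    static boolean fitsFloatRk(double v) {
        return (Double.doubleToLongBits(v) & 0x3FFFFFFFFL) == 0;
    }

    /** Hypothetical encoder for the integer form; valid for values that fit in 30 signed bits. */
    static int encodeIntRk(int v) {
        return (v << 2) | 0x02;
    }

    /** Hypothetical encoder for the floating-point form; lossless only if fitsFloatRk(v). */
    static int encodeFloatRk(double v) {
        return (int) (Double.doubleToLongBits(v) >>> 32) & 0xFFFFFFFC;
    }

    public static void main(String[] args) {
        // Assumed cell values around the first rows of the attachment.
        for (int v = 1880000; v <= 1880004; v++) {
            boolean asFloat = fitsFloatRk(v);
            int raw = asFloat ? encodeFloatRk(v) : encodeIntRk(v);
            System.out.printf("%d floatForm=%b raw=0x%08X decoded=%.0f%n",
                    v, asFloat, raw, decodeRk(raw));
        }
    }
}
```

If the hypothesis holds, a reader that handles only the floating-point branch would return correct values for the multiples of 4 and wrong values for everything in between, so the integer branch of the xlsb cell handling would be the first place to check.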
Comment 1 Tim Allison 2018-10-11 14:15:09 UTC