Reading an Excel-produced spreadsheet causes a NegativeArraySizeException to be thrown in processString. I've done some fairly extensive digging to find the cause of this problem and hopefully fix it. What it boils down to is that a few incorrect assumptions are made in parts of SSTRecord (specifically in manufactureStrings and processContinueRecord), and these fail for a string with extended information that doesn't fit completely within a record (i.e. one which requires continue records). I have a nasty suspicion that this is a design issue which would require reworking the parser so that, instead of trying to build up an incomplete string, it assembles an 'accumulated record' before string parsing, until the whole string and its other data can be fished out (or appropriately ignored). I can, if necessary, provide a spreadsheet which exhibits this problem, though it's pretty large. A simple example may well be enough, though: part way through a record there's a string which has a char_count of 44. This is a wide string, so its initial byte count is 91, and it also has the extended flag set, so the total size expands to 26719 bytes (which is obviously too big to fit in a single buffer). When manufactureStrings gets to this, it realizes the string is going to cross records, but the subsequent 'partial string' logic completely fails to take into account both the additional length field BEFORE the string and the fact that some of the final size might not actually be part of the character data, so it ends up vastly overestimating how many characters the string contains. I can't see any way of fixing this without rewriting the continuation handling as described above (and that's a little too major a change for me to attack at this moment, especially since I only downloaded HSSF yesterday! 8-))
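To make the arithmetic in the example above concrete, here is a rough sketch of how the on-disk size of one SST string breaks down. It is not HSSF code: the class and method names are hypothetical, and the layout (a 3-byte header, an optional 2-byte run count, an optional 4-byte extended-length field, then the character data, formatting runs and extended data) is my reading of the BIFF8 format, so treat it as an assumption. The point is only that the character data is a small part of the total, so the total can't be used to estimate the character count.

```java
// Hypothetical helper, not part of HSSF. Assumes the BIFF8 string layout:
// [2-byte char count][1-byte flags][2-byte run count, if rich]
// [4-byte extended length, if extended][character data][runs][extended data]
final class SstStringSize {
    static final int FLAG_WIDE = 0x01;      // 16-bit characters
    static final int FLAG_EXTENDED = 0x04;  // extended data follows the characters
    static final int FLAG_RICH = 0x08;      // formatting runs follow the characters

    /** Bytes occupied by the characters alone. */
    static int charDataBytes(int charCount, int flags) {
        return (flags & FLAG_WIDE) != 0 ? charCount * 2 : charCount;
    }

    /** Bytes occupied by the whole string entry, including non-character data. */
    static long totalBytes(int charCount, int flags, int runCount, long extendedLength) {
        long size = 3; // 2-byte char count + 1-byte flags
        if ((flags & FLAG_RICH) != 0) {
            size += 2 + 4L * runCount; // run count field + 4 bytes per run
        }
        if ((flags & FLAG_EXTENDED) != 0) {
            size += 4 + extendedLength; // length field BEFORE the string + the data itself
        }
        return size + charDataBytes(charCount, flags);
    }

    public static void main(String[] args) {
        // The string from the report: 44 wide characters, header included.
        System.out.println(totalBytes(44, FLAG_WIDE, 0, 0));
        // Same string with the extended flag set and a made-up extended length.
        System.out.println(totalBytes(44, FLAG_WIDE | FLAG_EXTENDED, 0, 1000));
    }
}
```

With 44 wide characters and no extra data this gives the 91 bytes mentioned above (3 header bytes plus 88 bytes of characters). Once the extended flag is set, the total grows by the 4-byte length field plus however much extended data there is, while the character data stays at 88 bytes.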
Great detective work. Yes, this is a known problem. It means that SSTRecord is going to become even MORE complicated and painful than it already is. Glen is working on refactoring this, I believe, and Marc mentioned an interest in correcting the problem. *** This bug has been marked as a duplicate of 7655 ***
Oooh, yes, nice job. As Andy suggested, I'm looking at this area right now (although for another reason), so I'll take this one. SST is a meaty little bastard. Do you have a test case or spreadsheet that's causing this problem? That would be a big help.