Summary: | Overflow in UnicodeString results in corrupted file when setCellValue() is called with a string larger than 32767 | ||
---|---|---|---|
Product: | POI | Reporter: | Mirjan Merruko <mmerruko> |
Component: | HSSF | Assignee: | POI Developers List <dev> |
Status: | NEW --- | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 3.9-FINAL | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: | Small JUnit test which demonstrates the problem with setCellValue(String) |
Description
Mirjan Merruko
2014-08-27 13:15:57 UTC
Are you able to create a short JUnit test which shows how to get around the check in HSSFRichTextString? (We need to decide whether we should remove the cast or add an additional check; a unit test showing how to trigger it should help with that.)
Created attachment 31951
Small JUnit test which demonstrates the problem with setCellValue(String)
The calls to HSSFRichTextString(value) and UnicodeString(value) are made before setCellValue(), so currently the length can already be cut off by the cast in there before the actual check runs.
It seems UnicodeString is used for multiple items listed in the spec. According to the spec, under "2.5.294 XLUnicodeString", the length is specified as 2 bytes. Another usage is XLUnicodeRichExtendedString, under "2.5.293 XLUnicodeRichExtendedString"; this one allows continuation records, but the length field is still only 2 bytes. So it seems there is a limit of 65535 characters in the record definitions.
SpreadsheetVersion.EXCEL97, which we use to verify text length in other places, has 32767; I am not sure whether this is somewhere in the spec or imposed because of other issues. At least formula text seems to be limited to this value by the spec.
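To illustrate the two limits discussed above, here is a minimal, self-contained sketch of what happens when a character count is narrowed to a Java `short`. It does not use POI and is not the actual UnicodeString code; the class `LengthFieldDemo` and its methods are hypothetical names chosen for illustration. It only shows the arithmetic: a signed 16-bit value tops out at 32767, while the same two bytes read as unsigned can hold up to 65535.

```java
// Hypothetical illustration only; not taken from the POI sources.
final class LengthFieldDemo {

    // Largest value of a signed 16-bit field (Short.MAX_VALUE).
    static final int SIGNED_LIMIT = 32767;

    // Largest value of an unsigned 16-bit (2-byte) length field.
    static final int UNSIGNED_LIMIT = 65535;

    // Narrowing cast: keeps the low 16 bits and interprets them as signed.
    static short asSignedShort(int length) {
        return (short) length;
    }

    // Reads the same low 16 bits as an unsigned value.
    static int asUnsigned16(int length) {
        return length & 0xFFFF;
    }

    // A length check that only helps if it runs BEFORE any narrowing cast.
    static boolean fitsSignedLimit(int length) {
        return length >= 0 && length <= SIGNED_LIMIT;
    }

    public static void main(String[] args) {
        int[] lengths = {32767, 32768, 40000, 65535, 70000};
        for (int length : lengths) {
            System.out.println(length
                    + " -> signed short " + asSignedShort(length)
                    + ", unsigned 16-bit " + asUnsigned16(length)
                    + ", fits signed limit: " + fitsSignedLimit(length));
        }
    }
}
```

For a length of 32768 the cast yields -32768, and for 70000 both interpretations wrap around to 4464, so a check applied after the cast sees a value that no longer reflects the real string length. This is why the order of the cast and the check matters in the discussion above.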